Search Results: "uli"

12 February 2024

Freexian Collaborators: Monthly report about Debian Long Term Support, January 2024 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors In January, 16 contributors were paid to work on Debian LTS; their reports are available:
  • Abhijith PA did 14.0h (out of 7.0h assigned and 7.0h from previous period).
  • Bastien Roucariès did 22.0h (out of 16.0h assigned and 6.0h from previous period).
  • Ben Hutchings did 14.5h (out of 8.0h assigned and 16.0h from previous period), thus carrying over 9.5h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 10.0h (out of 10.0h assigned).
  • Emilio Pozuelo Monfort did 10.0h (out of 14.75h assigned and 27.0h from previous period), thus carrying over 31.75h to the next month.
  • Guilhem Moulin did 9.75h (out of 25.0h assigned), thus carrying over 15.25h to the next month.
  • Holger Levsen did 3.5h (out of 12.0h assigned), thus carrying over 8.5h to the next month.
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Roberto C. Sánchez did 8.75h (out of 9.5h assigned and 2.5h from previous period), thus carrying over 3.25h to the next month.
  • Santiago Ruano Rincón did 13.5h (out of 8.25h assigned and 7.75h from previous period), thus carrying over 2.5h to the next month.
  • Sean Whitton did 0.5h (out of 0.25h assigned and 5.75h from previous period), thus carrying over 5.5h to the next month.
  • Sylvain Beucler did 9.5h (out of 23.25h assigned and 18.5h from previous period), thus carrying over 32.25h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 12.0h (out of 10.25h assigned and 1.75h from previous period).
  • Utkarsh Gupta did 8.5h (out of 35.75h assigned), thus carrying over 24.75h to the next month.

Evolution of the situation In January, we released 25 DLAs. A variety of particularly notable packages were updated during January, among them the Linux kernel (both versions 5.10 and 4.19), mariadb-10.3, openjdk-11, firefox-esr, and thunderbird. In addition to the many other LTS package updates released in January, LTS contributors continue their efforts to make impactful contributions both within and beyond the Debian community.

Thanks to our sponsors Sponsors that joined recently are in bold.

7 February 2024

Reproducible Builds: Reproducible Builds in January 2024

Welcome to the January 2024 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.

How we executed a critical supply chain attack on PyTorch John Stawinski and Adnan Khan published a lengthy blog post detailing how they executed a supply-chain attack against PyTorch, a popular machine learning platform "used by titans like Google, Meta, Boeing, and Lockheed Martin":
Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies; the list goes on. In short, it was bad. Quite bad.
The attack pivoted on PyTorch's use of self-hosted runners, as well as on submitting a pull request to address a trivial typo in the project's README file, to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.

New Arch Linux forensic filesystem tool On our mailing list this month, long-time Reproducible Builds developer kpcyrd announced a new tool designed to forensically analyse Arch Linux filesystem images. Called archlinux-userland-fs-cmp, the tool is supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], /mnt. Crucially, however, at no point is any file from the mounted filesystem eval'd or otherwise executed. Parsers are written in a memory safe language. More information about the tool can be found in their announcement message, as well as on the tool's homepage. A GIF of the tool in action is also available.

Issues with our SOURCE_DATE_EPOCH code? Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the SOURCE_DATE_EPOCH environment variable:
I'm not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here.
Chris ends his message with a request that those with intimate or low-level knowledge of time_t, C types, overflows, and the various parsing libraries in the C standard library (etc.) contribute further information.
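For readers unfamiliar with the variable, here is a minimal, hedged sketch of the kind of defensive SOURCE_DATE_EPOCH handling being discussed (Python is used for brevity; this is not the snippet in question, and the bounds chosen here are illustrative assumptions):

import os

def get_source_date_epoch(default=None):
    """Parse SOURCE_DATE_EPOCH defensively, returning `default` if unset."""
    raw = os.environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        return default
    # Reject empty strings, signs, whitespace and non-decimal input outright.
    if not raw.isdigit():
        raise ValueError(f"SOURCE_DATE_EPOCH is not a non-negative integer: {raw!r}")
    value = int(raw)
    # Illustrative bound: refuse values a signed 64-bit time_t cannot represent,
    # so nothing downstream silently truncates or overflows.
    if value > 2**63 - 1:
        raise ValueError(f"SOURCE_DATE_EPOCH out of range: {value}")
    return value

The point is not this particular snippet, but that every consumer of the variable has to make the same decisions about signs, leading zeroes, overflow, and error handling, which is exactly the class of problem raised in the thread.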

Distribution updates In Debian this month, Roland Clobus posted another detailed update on the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that "all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours)". Additionally, 7 of the 8 bookworm images from the official download link build reproducibly at any later time. In addition to this, three reviews of Debian packages were added, 17 were updated and 15 were removed this month, adding to our knowledge about identified issues. Outside of Debian, Bernhard posted another monthly update covering his reproducibility work in openSUSE.

Community updates A number of improvements were made to our website, including Bernhard M. Wiedemann fixing a number of typos of the term "nondeterministic" [ ] and Jan Zerebecki adding a substantial and highly welcome section to our page about SOURCE_DATE_EPOCH to document its interaction with distribution rebuilds [ ].
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, such as uploading versions 254 and 255 to Debian, but focusing on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson [ ][ ], as well as considerable work from Vekhir to fix compatibility between various subtly incompatible versions of the progressbar libraries in Python [ ][ ][ ][ ]. Thanks!

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Reduce the number of arm64 architecture workers from 24 to 16. [ ]
    • Use diffoscope from the Debian release being tested again. [ ]
    • Improve the handling when killing unwanted processes [ ][ ][ ] and be more verbose about it, too [ ].
    • Don't mark a job as failed if a process marked as to-be-killed is already gone. [ ]
    • Display the architecture of builds that have been running for more than 48 hours. [ ]
    • Reboot arm64 nodes when they hit an OOM (out of memory) state. [ ]
  • Package rescheduling changes:
    • Reduce IRC notifications to 1 when rescheduling due to package status changes. [ ]
    • Correctly set SUDO_USER when rescheduling packages. [ ]
    • Automatically reschedule packages regressing to FTBFS (build failure) or FTBR (build success, but unreproducible). [ ]
  • OpenWrt-related changes:
    • Install the python3-dev and python3-pyelftools packages as they are now needed for the sunxi target. [ ][ ]
    • Also install libpam0g-dev, which is needed by some OpenWrt hardware targets. [ ]
  • Misc:
    • As it's January, set the real_year variable to 2024 [ ] and bump various copyright years as well [ ].
    • Fix a large (!) number of spelling mistakes in various scripts. [ ][ ][ ]
    • Prevent Squid and Systemd processes from being killed by the kernel's OOM killer. [ ]
    • Install the iptables tool everywhere, else our custom rc.local script fails. [ ]
    • Cleanup the /srv/workspace/pbuilder directory on boot. [ ]
    • Automatically restart Squid if it fails. [ ]
    • Limit the execution of chroot-installation jobs to a maximum of 4 concurrent runs. [ ][ ]
Significant amounts of node maintenance were performed by Holger Levsen (e.g. [ ][ ][ ][ ][ ][ ][ ] etc.) and Vagrant Cascadian (e.g. [ ][ ][ ][ ][ ][ ][ ][ ]). Indeed, Vagrant Cascadian handled an extended power outage for the network running the Debian armhf architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. [ ] Elsewhere in our infrastructure, Holger Levsen also adjusted the email configuration for @reproducible-builds.org to deal with a new SMTP email attack. [ ]

Upstream patches The Reproducible Builds project tries to detect, dissect, and fix as many (currently) unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate, and this month we wrote a large number of such patches. Separate to this, Vagrant Cascadian followed up with the relevant maintainers when reproducibility fixes were not included in newly-uploaded versions of the mm-common package in Debian; this was quickly fixed, however. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

2 February 2024

Ian Jackson: UPS, the Useless Parcel Service; VAT and fees

I recently had the most astonishingly bad experience with UPS, the courier company. They severely damaged my parcels, and were very bad about UK import VAT, ultimately ending up harassing me on autopilot. The only thing that got their attention was my draft Particulars of Claim for intended legal action. Surprisingly, I got them to admit in writing that the disbursement fee they charge recipients alongside the actual VAT is just something they made up with no legal basis.

What happened Autumn last year I ordered some furniture from a company in Germany. This was to be shipped by them to me by courier. The supplier chose UPS. UPS misrouted one of the three parcels to Denmark. When everything arrived, it had been sat on by elephants. The supplier had to replace most of it, with considerable inconvenience and delay to me, and of course a loss to the supplier. But this post isn't mostly about that. This post is about VAT. You see, import VAT was due, because of fucking Brexit. UPS made a complete hash of collecting that VAT. Their computers can't issue coherent documents, their email helpdesk is completely useless, and their automated debt collection systems run along uninfluenced by any external input. The crazy, including legal threats and escalating late payment fees, continued even after I paid the VAT discrepancy (which I did despite them not yet having provided any coherent calculation for it). This kind of behaviour is a very small and mild version of the kind of things British Gas did to Lisa Ferguson, who eventually won substantial damages for harassment, plus £10K of costs. Having tried asking nicely, and sending stiff letters, I too threatened litigation. I would have actually started a court claim, but it would have included a claim under the Protection from Harassment Act. Those have to be filed under the "Part 8 procedure", which involves sending all of the written evidence you're going to use along with the claim form. Collating all that would be a good deal of work, especially since UPS and ControlAccount didn't engage with me at all, so I had no idea which things they might actually dispute. So I decided that before issuing proceedings, I'd send them a copy of my draft Particulars of Claim, along with an offer to settle if they would pay me a modest sum and stop being evil robots at me. Rather than me typing the whole tale in again, you can read the full gory details in the PDF of my draft Particulars of Claim. (I've redacted the reference numbers.)

Outcome The draft Particulars finally got their attention. UPS sent me an offer: they agreed to pay me £50, in full and final settlement. That was close enough to my offer that I accepted it. I mostly wanted them to stop, and they do seem to have done so. And I've received the £50.

VAT calculation They also finally included an actual explanation of the VAT calculation. It's absurd, but it's not UPS's absurd:
The clearance was entered initially with estimated import charges of £400.03, consisting of £387.83 VAT, and £12.20 disbursement fee. This original entry regrettably did not include the freight cost for calculating the VAT, and as such when submitted for final entry the VAT value was adjusted to include this and an amended invoice was issued for an additional £39.84. HMRC calculate the amount against which VAT is raised using the value of goods, insurance and freight, however they also may apply a VAT adjustment figure. The VAT Adjustment is based on many factors (incidental costs in regards to a shipment), which includes charge for currency conversion if the invoice does not list values in Sterling, but the main is due to the inland freight from airport of destination to the final delivery point, as this charge varies, for example, from EMA to Edinburgh would be £150, from EMA to Derby would be £1, so each year UPS must supply HMRC with all values incurred for entry build up and they give an average which UPS have to use on the entry build up as the VAT Adjustment. The correct calculation for the import charges is therefore as follows:
Goods value divided by exchange rate: 2,489.53 EUR / 1.1683 = 2,130.89 GBP
Duty: Goods value plus freight (5%): 2,130.89 GBP + 5% = 2,237.43 GBP. That total times the duty rate: x 0% = 0 GBP
VAT: Goods value plus freight (100%): 2,130.89 GBP + 0 = 2,130.89 GBP. That total plus duty and VAT adjustment: 2,130.89 GBP + 0 GBP + 7.49 GBP = 2,348.08 GBP. That total times 20% VAT = 427.67 GBP
As detailed above we must confirm that the final VAT charges applied to the shipment were correct, and that no refund of this is therefore due.
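For readers trying to follow that arithmetic, here is a quick sketch reproducing the quoted calculation (all figures are taken from the letter; note that the quoted intermediate total of 2,348.08 GBP does not match 2,130.89 + 0 + 7.49 = 2,138.38 GBP, which is the base that actually yields the final 427.67 GBP figure, so the 2,348.08 appears to be a transcription slip):

# Reproduce the VAT calculation quoted in UPS's letter (figures from the letter).
goods_eur = 2489.53
exchange_rate = 1.1683

goods_gbp = goods_eur / exchange_rate           # ~2,130.89 GBP
duty_base = goods_gbp * 1.05                    # ~2,237.43 GBP (goods + 5% freight)
duty = duty_base * 0.0                          # duty rate is 0%

vat_adjustment = 7.49                           # HMRC "VAT adjustment" figure
vat_base = goods_gbp + duty + vat_adjustment    # ~2,138.38 GBP (letter says 2,348.08)
vat = vat_base * 0.20                           # ~427.68 GBP, matching the letter's 427.67

print(f"goods: {goods_gbp:.2f} GBP, VAT base: {vat_base:.2f} GBP, VAT: {vat:.2f} GBP")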
This looks very like HMRC-originated nonsense. If only they had put it on the original bills! It's completely ridiculous that it took four months and near-litigation to obtain it.

Disbursement fee One more thing. UPS billed me a £12 "disbursement fee". When you import something, there's often tax to pay. The courier company pays that to the government, and the consignee pays it to the courier. Usually the courier demands it before final delivery, since otherwise they end up having to chase it as a debt. It is common for parcel companies to add a random fee of their own. As I note in my Particulars, there isn't any legal basis for this. In my own offer of settlement I proposed that UPS should:
State under what principle of English law (such as, what enactment or principle of Common Law), you levy the disbursement fee (or refund it).
To my surprise they actually responded to this in their own settlement letter. (They didn't, for example, mention the harassment at all.) They said (emphasis mine):
A disbursement fee is a fee for amounts paid or processed on behalf of a client. It is an established category of charge used by legal firms, amongst other companies, for billing of various ancillary costs which may be incurred in completion of service. Disbursement fees are not covered by a specific law, nor are they legally prohibited. Regarding UPS' disbursement fee, this is an administrative charge levied for the use of UPS' deferment account to prepay import charges for clearance through CDS. This charge would therefore be billed to the party that is responsible for the import charges, normally the consignee or receiver of the shipment in question. The disbursement fee as applied is legitimate, and as you have stated is a commonly used and recognised charge throughout the courier industry, and I can confirm that this was charged correctly in this instance.
On UPS's analysis, they can just make up whatever fee they like. That is clearly not right (and I don't even need to refer to consumer protection law, which would also make it obviously unlawful). And, that everyone does it doesn't make it lawful. There are so many things that are ubiquitous but unlawful, especially nowadays when much of the legal system - especially consumer protection regulators - has been underfunded to beyond the point of collapse. Next time this comes up I might have a go at getting the fee back. (Obviously I'll have to pay it first, to get my parcel.)

ParcelForce and Royal Mail I think this analysis doesn't apply to ParcelForce and (probably) Royal Mail. I looked into this in 2009, and I found that Parcelforce had been given the ability to write their own private laws: Schemes made under section 89 of the Postal Services Act 2000. This is obviously ridiculous but I think it was the law in 2009. I doubt the intervening governments have fixed it.

Furniture Oh, yes, the actual furniture. The replacements arrived intact and are great :-).


30 January 2024

Antoine Beaupré: router archeology: the Soekris net5501

Roadkiller was a Soekris net5501 router I used as my main gateway between 2010 and 2016 (for réseau and téléphone). It was upgraded to FreeBSD 8.4-p12 (2014-06-06) and pkgng. It was retired in favor of octavia around 2016. Roughly 10 years later (2024-01-24), I found it in a drawer and, to my surprise, it booted. After wrangling with an RS-232 USB adapter, a null modem cable, and bit rates, I even logged in:
comBIOS ver. 1.33  20070103  Copyright (C) 2000-2007 Soekris Engineering.
net5501
0512 Mbyte Memory                        CPU Geode LX 500 Mhz 
Pri Mas  WDC WD800VE-00HDT0              LBA Xlt 1024-255-63  78 Gbyte
Slot   Vend Dev  ClassRev Cmd  Stat CL LT HT  Base1    Base2   Int 
-------------------------------------------------------------------
0:01:2 1022 2082 10100000 0006 0220 08 00 00 A0000000 00000000 10
0:06:0 1106 3053 02000096 0117 0210 08 40 00 0000E101 A0004000 11
0:07:0 1106 3053 02000096 0117 0210 08 40 00 0000E201 A0004100 05
0:08:0 1106 3053 02000096 0117 0210 08 40 00 0000E301 A0004200 09
0:09:0 1106 3053 02000096 0117 0210 08 40 00 0000E401 A0004300 12
0:20:0 1022 2090 06010003 0009 02A0 08 40 80 00006001 00006101 
0:20:2 1022 209A 01018001 0005 02A0 08 00 00 00000000 00000000 
0:21:0 1022 2094 0C031002 0006 0230 08 00 80 A0005000 00000000 15
0:21:1 1022 2095 0C032002 0006 0230 08 00 00 A0006000 00000000 15
 4 Seconds to automatic boot.   Press Ctrl-P for entering Monitor.
 
                                            
            Welcome to FreeBSD!

    1. Boot FreeBSD [default]
    2. Boot FreeBSD with ACPI enabled
    3. Boot FreeBSD in Safe Mode
    4. Boot FreeBSD in single user mode
    5. Boot FreeBSD with verbose logging
    6. Escape to loader prompt
    7. Reboot

    Select option, [Enter] for default
    or [Space] to pause timer  5
  
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.4-RELEASE-p12 #5: Fri Jun  6 02:43:23 EDT 2014
    root@roadkiller.anarc.at:/usr/obj/usr/src/sys/ROADKILL i386
gcc version 4.2.2 20070831 prerelease [FreeBSD]
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Geode(TM) Integrated Processor by AMD PCS (499.90-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x5a2  Family = 5  Model = a  Stepping = 2
  Features=0x88a93d<FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CLFLUSH,MMX>
  AMD Features=0xc0400000<MMX+,3DNow!+,3DNow!>
real memory  = 536870912 (512 MB)
avail memory = 506445824 (482 MB)
kbd1 at kbdmux0
K6-family MTRR support enabled (2 registers)
ACPI Error: A valid RSDP was not found (20101013/tbxfroot-309)
ACPI: Table initialisation failed: AE_NOT_FOUND
ACPI: Try disabling either ACPI or apic support.
cryptosoft0: <software crypto> on motherboard
pcib0 pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
Geode LX: Soekris net5501 comBIOS ver. 1.33 20070103 Copyright (C) 2000-2007
pci0: <encrypt/decrypt, entertainment crypto> at device 1.2 (no driver attached)
vr0: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe100-0xe1ff mem 0xa0004000-0xa00040ff irq 11 at device 6.0 on pci0
vr0: Quirks: 0x2
vr0: Revision: 0x96
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> PHY 1 on miibus0
ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr0: Ethernet address: 00:00:24:cc:93:44
vr0: [ITHREAD]
vr1: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe200-0xe2ff mem 0xa0004100-0xa00041ff irq 5 at device 7.0 on pci0
vr1: Quirks: 0x2
vr1: Revision: 0x96
miibus1: <MII bus> on vr1
ukphy1: <Generic IEEE 802.3u media interface> PHY 1 on miibus1
ukphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr1: Ethernet address: 00:00:24:cc:93:45
vr1: [ITHREAD]
vr2: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe300-0xe3ff mem 0xa0004200-0xa00042ff irq 9 at device 8.0 on pci0
vr2: Quirks: 0x2
vr2: Revision: 0x96
miibus2: <MII bus> on vr2
ukphy2: <Generic IEEE 802.3u media interface> PHY 1 on miibus2
ukphy2:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr2: Ethernet address: 00:00:24:cc:93:46
vr2: [ITHREAD]
vr3: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe400-0xe4ff mem 0xa0004300-0xa00043ff irq 12 at device 9.0 on pci0
vr3: Quirks: 0x2
vr3: Revision: 0x96
miibus3: <MII bus> on vr3
ukphy3: <Generic IEEE 802.3u media interface> PHY 1 on miibus3
ukphy3:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr3: Ethernet address: 00:00:24:cc:93:47
vr3: [ITHREAD]
isab0: <PCI-ISA bridge> at device 20.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <AMD CS5536 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 20.2 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata0: [ITHREAD]
ata1: <ATA channel> at channel 1 on atapci0
ata1: [ITHREAD]
ohci0: <OHCI (generic) USB controller> mem 0xa0005000-0xa0005fff irq 15 at device 21.0 on pci0
ohci0: [ITHREAD]
usbus0 on ohci0
ehci0: <AMD CS5536 (Geode) USB 2.0 controller> mem 0xa0006000-0xa0006fff irq 15 at device 21.1 on pci0
ehci0: [ITHREAD]
usbus1: EHCI version 1.0
usbus1 on ehci0
cpu0 on motherboard
pmtimer0 on isa0
orm0: <ISA Option ROM> at iomem 0xc8000-0xd27ff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
atrtc0: <AT Real Time Clock> at port 0x70 irq 8 on isa0
ppc0: parallel port not found.
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
uart0: [FILTER]
uart0: console (19200,n,8,1)
uart1: <16550 or compatible> at port 0x2f8-0x2ff irq 3 on isa0
uart1: [FILTER]
Timecounter "TSC" frequency 499903982 Hz quality 800
Timecounters tick every 1.000 msec
IPsec: Initialized Security Association Processing.
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 480Mbps High Speed USB v2.0
ad0: 76319MB <WDC WD800VE-00HDT0 09.07D09> at ata0-master UDMA100 
ugen0.1: <AMD> at usbus0
uhub0: <AMD OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <AMD> at usbus1
uhub1: <AMD EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
GEOM: ad0s1: geometry does not match label (255h,63s != 16h,63s).
uhub0: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1
Root mount waiting for: usbus1
uhub1: 4 ports with 4 removable, self powered
Trying to mount root from ufs:/dev/ad0s1a
The last log rotation is from 2016:
[root@roadkiller /var/log]# stat /var/log/wtmp      
65 61783 -rw-r--r-- 1 root wheel 208219 1056 "Nov  1 05:00:01 2016" "Jan 18 22:29:16 2017" "Jan 18 22:29:16 2017" "Nov  1 05:00:01 2016" 16384 4 0 /var/log/wtmp
Interestingly, I switched between eicat and teksavvy on December 11th. Which year? Who knows!
Dec 11 16:38:40 roadkiller mpd: [eicatL0] LCP: authorization successful
Dec 11 16:41:15 roadkiller mpd: [teksavvyL0] LCP: authorization successful
Never realized those good old logs had an "oh dear, forgot the year" issue (that's something like Y2K except just "Y", I guess). That was probably 2015, because the log dates from 2017, and the last entry is from November of the year after the above:
[root@roadkiller /var/log]# stat mpd.log 
65 47113 -rw-r--r-- 1 root wheel 193008 71939195 "Jan 18 22:39:18 2017" "Jan 18 22:39:59 2017" "Jan 18 22:39:59 2017" "Apr  2 10:41:37 2013" 16384 140640 0 mpd.log
It looks like the system was installed in 2010:
[root@roadkiller /var/log]# stat /
63 2 drwxr-xr-x 21 root wheel 2120 512 "Jan 18 22:34:43 2017" "Jan 18 22:28:12 2017" "Jan 18 22:28:12 2017" "Jul 18 22:25:00 2010" 16384 4 0 /
... so it lived for about 6 years, but still works after almost 14 years, which I find utterly amazing. Another amazing thing is that there's tuptime installed on that server! That is software I thought I had discovered later and then sponsored in Debian, but it turns out I was already using it back then!
[root@roadkiller /var]# tuptime 
System startups:        19   since   21:20:16 11/07/15
System shutdowns:       0 ok   -   18 bad
System uptime:          85.93 %   -   1 year, 11 days, 10 hours, 3 minutes and 36 seconds
System downtime:        14.07 %   -   61 days, 15 hours, 22 minutes and 45 seconds
System life:            1 year, 73 days, 1 hour, 26 minutes and 20 seconds
Largest uptime:         122 days, 9 hours, 17 minutes and 6 seconds   from   08:17:56 02/02/16
Shortest uptime:        5 minutes and 4 seconds   from   21:55:00 01/18/17
Average uptime:         19 days, 19 hours, 28 minutes and 37 seconds
Largest downtime:       57 days, 1 hour, 9 minutes and 59 seconds   from   20:45:01 11/22/16
Shortest downtime:      -1 years, 364 days, 23 hours, 58 minutes and 12 seconds   from   22:30:01 01/18/17
Average downtime:       3 days, 5 hours, 51 minutes and 43 seconds
Current uptime:         18 minutes and 23 seconds   since   22:28:13 01/18/17
Actual up/down times:
[root@roadkiller /var]# tuptime -t
No.        Startup Date                                         Uptime       Shutdown Date   End                                                  Downtime
1     21:20:16 11/07/15      1 day, 0 hours, 40 minutes and 12 seconds   22:00:28 11/08/15   BAD                                  2 minutes and 37 seconds
2     22:03:05 11/08/15      1 day, 9 hours, 41 minutes and 57 seconds   07:45:02 11/10/15   BAD                                  3 minutes and 24 seconds
3     07:48:26 11/10/15    20 days, 2 hours, 41 minutes and 34 seconds   10:30:00 11/30/15   BAD                        4 hours, 50 minutes and 21 seconds
4     15:20:21 11/30/15                      19 minutes and 40 seconds   15:40:01 11/30/15   BAD                                   6 minutes and 5 seconds
5     15:46:06 11/30/15                      53 minutes and 55 seconds   16:40:01 11/30/15   BAD                           1 hour, 1 minute and 38 seconds
6     17:41:39 11/30/15     6 days, 16 hours, 3 minutes and 22 seconds   09:45:01 12/07/15   BAD                4 days, 6 hours, 53 minutes and 11 seconds
7     16:38:12 12/11/15   50 days, 17 hours, 56 minutes and 49 seconds   10:35:01 01/31/16   BAD                                 10 minutes and 52 seconds
8     10:45:53 01/31/16     1 day, 21 hours, 28 minutes and 16 seconds   08:14:09 02/02/16   BAD                                  3 minutes and 48 seconds
9     08:17:56 02/02/16    122 days, 9 hours, 17 minutes and 6 seconds   18:35:02 06/03/16   BAD                                 10 minutes and 16 seconds
10    18:45:18 06/03/16   29 days, 17 hours, 14 minutes and 43 seconds   12:00:01 07/03/16   BAD                                 12 minutes and 34 seconds
11    12:12:35 07/03/16   31 days, 17 hours, 17 minutes and 26 seconds   05:30:01 08/04/16   BAD                                 14 minutes and 25 seconds
12    05:44:26 08/04/16     15 days, 1 hour, 55 minutes and 35 seconds   07:40:01 08/19/16   BAD                                  6 minutes and 51 seconds
13    07:46:52 08/19/16     7 days, 5 hours, 23 minutes and 10 seconds   13:10:02 08/26/16   BAD                                  3 minutes and 45 seconds
14    13:13:47 08/26/16   27 days, 21 hours, 36 minutes and 14 seconds   10:50:01 09/23/16   BAD                                  2 minutes and 14 seconds
15    10:52:15 09/23/16   60 days, 10 hours, 52 minutes and 46 seconds   20:45:01 11/22/16   BAD                 57 days, 1 hour, 9 minutes and 59 seconds
16    21:55:00 01/18/17                        5 minutes and 4 seconds   22:00:04 01/18/17   BAD                                 11 minutes and 15 seconds
17    22:11:19 01/18/17                       8 minutes and 42 seconds   22:20:01 01/18/17   BAD                                   1 minute and 20 seconds
18    22:21:21 01/18/17                       8 minutes and 40 seconds   22:30:01 01/18/17   BAD   -1 years, 364 days, 23 hours, 58 minutes and 12 seconds
19    22:28:13 01/18/17                      20 minutes and 17 seconds
The last few entries are actually the tests I'm running now; it seems this machine thinks we're now on 2017-01-18 at ~22:00, while it's actually 2024-01-24 at ~12:00 local:
Wed Jan 18 23:05:38 EST 2017
FreeBSD/i386 (roadkiller.anarc.at) (ttyu0)
login: root
Password:
Jan 18 23:07:10 roadkiller login: ROOT LOGIN (root) ON ttyu0
Last login: Wed Jan 18 22:29:16 on ttyu0
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 8.4-RELEASE-p12 (ROADKILL) #5: Fri Jun  6 02:43:23 EDT 2014
Reminders:
 * commit stuff in /etc
 * reload firewall (in screen!):
    pfctl -f /etc/pf.conf ; sleep 1
 * vim + syn on makes pf.conf more readable
 * monitoring the PPPoE uplink:
   tail -f /var/log/mpd.log
Current problems:
 * sometimes pf doesn't start properly on boot, if pppoe failed to come up, use
   this to resume:
     /etc/rc.d/pf start
   it will kill your shell, but fix NAT (2012-08-10)
 * babel fails to start on boot (2013-06-15):
     babeld -D -g 33123 tap0 vr3
 * DNS often fails, tried messing with unbound.conf (2014-10-05) and updating
   named.root (2016-01-28) and performance tweaks (ee63689)
 * asterisk and mpd4 are deprecated and should be uninstalled when we're sure
   their replacements (voipms + ata and mpd5) are working (2015-01-13)
 * if IPv6 fails, it's because netblocks are not being routed upstream. DHCPcd
   should do this, but doesn't start properly, use this to resume (2015-12-21):
     /usr/local/sbin/dhcpcd -6 --persistent --background --timeout 0 -C resolv.conf ng0
This machine is doomed to be replaced with the new omnia router, Indiegogo
campaign should ship in april 2016: http://igg.me/at/turris-omnia/x
(I really like the motd I left myself there. In theory, I guess this could just start connecting to the internet again if I still had the same PPPoE/ADSL link I had almost a decade ago; obviously, I do not.) Not sure how the system figured the 2017 time: the onboard clock itself believes we're in 1980, so clearly the CMOS battery has (understandably) failed:
> ?
comBIOS Monitor Commands
boot [drive][:partition] INT19 Boot
reboot                   cold boot
download                 download a file using XMODEM/CRC
flashupdate              update flash BIOS with downloaded file
time [HH:MM:SS]          show or set time
date [YYYY/MM/DD]        show or set date
d[b w d] [adr]           dump memory bytes/words/dwords
e[b w d] adr value [...] enter bytes/words/dwords
i[b w d] port            input from 8/16/32-bit port
o[b w d] port value      output to 8/16/32-bit port
run adr                  execute code at adr
cmosread [adr]           read CMOS RAM data
cmoswrite adr byte [...] write CMOS RAM data
cmoschecksum             update CMOS RAM Checksum
set parameter=value      set system parameter to value
show [parameter]         show one or all system parameters
?/help                   show this help
> show
ConSpeed = 19200
ConLock = Enabled
ConMute = Disabled
BIOSentry = Enabled
PCIROMS = Enabled
PXEBoot = Enabled
FLASH = Primary
BootDelay = 5
FastBoot = Disabled
BootPartition = Disabled
BootDrive = 80 81 F0 FF 
ShowPCI = Enabled
Reset = Hard
CpuSpeed = Default
> time
Current Date and Time is: 1980/01/01 00:56:47
Another bit of archeology: I had documented various outages with my ISP... back in 2003!
[root@roadkiller ~/bin]# cat ppp_stats/downtimes.txt
11/03/2003 18:24:49 218
12/03/2003 09:10:49 118
12/03/2003 10:05:57 680
12/03/2003 10:14:50 106
12/03/2003 10:16:53 6
12/03/2003 10:35:28 146
12/03/2003 10:57:26 393
12/03/2003 11:16:35 5
12/03/2003 11:16:54 11
13/03/2003 06:15:57 18928
13/03/2003 09:43:36 9730
13/03/2003 10:47:10 23
13/03/2003 10:58:35 5
16/03/2003 01:32:36 338
16/03/2003 02:00:33 120
16/03/2003 11:14:31 14007
19/03/2003 00:56:27 11179
19/03/2003 00:56:43 5
19/03/2003 00:56:53 0
19/03/2003 00:56:55 1
19/03/2003 00:57:09 1
19/03/2003 00:57:10 1
19/03/2003 00:57:24 1
19/03/2003 00:57:25 1
19/03/2003 00:57:39 1
19/03/2003 00:57:40 1
19/03/2003 00:57:44 3
19/03/2003 00:57:53 0
19/03/2003 00:57:55 0
19/03/2003 00:58:08 0
19/03/2003 00:58:10 0
19/03/2003 00:58:23 0
19/03/2003 00:58:25 0
19/03/2003 00:58:39 1
19/03/2003 00:58:42 2
19/03/2003 00:58:58 5
19/03/2003 00:59:35 2
19/03/2003 00:59:47 3
19/03/2003 01:00:34 3
19/03/2003 01:00:39 0
19/03/2003 01:00:54 0
19/03/2003 01:01:11 2
19/03/2003 01:01:25 1
19/03/2003 01:01:48 1
19/03/2003 01:02:03 1
19/03/2003 01:02:10 2
19/03/2003 01:02:20 3
19/03/2003 01:02:44 3
19/03/2003 01:03:45 3
19/03/2003 01:04:39 2
19/03/2003 01:05:40 2
19/03/2003 01:06:35 2
19/03/2003 01:07:36 2
19/03/2003 01:08:31 2
19/03/2003 01:08:38 2
19/03/2003 01:10:07 3
19/03/2003 01:11:05 2
19/03/2003 01:12:03 3
19/03/2003 01:13:01 3
19/03/2003 01:13:58 2
19/03/2003 01:14:59 5
19/03/2003 01:15:54 2
19/03/2003 01:16:55 2
19/03/2003 01:17:50 2
19/03/2003 01:18:51 3
19/03/2003 01:19:46 2
19/03/2003 01:20:46 2
19/03/2003 01:21:42 3
19/03/2003 01:22:42 3
19/03/2003 01:23:37 2
19/03/2003 01:24:38 3
19/03/2003 01:25:33 2
19/03/2003 01:26:33 2
19/03/2003 01:27:30 3
19/03/2003 01:28:55 2
19/03/2003 01:29:56 2
19/03/2003 01:30:50 2
19/03/2003 01:31:42 3
19/03/2003 01:32:36 3
19/03/2003 01:33:27 2
19/03/2003 01:34:21 2
19/03/2003 01:35:22 2
19/03/2003 01:36:17 3
19/03/2003 01:37:18 2
19/03/2003 01:38:13 3
19/03/2003 01:39:39 2
19/03/2003 01:40:39 2
19/03/2003 01:41:35 3
19/03/2003 01:42:35 3
19/03/2003 01:43:31 3
19/03/2003 01:44:31 3
19/03/2003 01:45:53 3
19/03/2003 01:46:48 3
19/03/2003 01:47:48 2
19/03/2003 01:48:44 3
19/03/2003 01:49:44 2
19/03/2003 01:50:40 3
19/03/2003 01:51:39 1
19/03/2003 11:04:33 19   
19/03/2003 18:39:36 2833 
19/03/2003 18:54:05 825  
19/03/2003 19:04:00 454  
19/03/2003 19:08:11 210  
19/03/2003 19:41:44 272  
19/03/2003 21:18:41 208  
24/03/2003 04:51:16 6
27/03/2003 04:51:20 5
30/03/2003 04:51:25 5
31/03/2003 08:30:31 255  
03/04/2003 08:30:36 5
06/04/2003 01:16:00 621  
06/04/2003 22:18:08 17   
06/04/2003 22:32:44 13   
09/04/2003 22:33:12 28   
12/04/2003 22:33:17 6
15/04/2003 22:33:22 5
17/04/2003 15:03:43 18   
20/04/2003 15:03:48 5
23/04/2003 15:04:04 16   
23/04/2003 21:08:30 339  
23/04/2003 21:18:08 13   
23/04/2003 23:34:20 253  
26/04/2003 23:34:45 25   
29/04/2003 23:34:49 5
02/05/2003 13:10:01 185  
05/05/2003 13:10:06 5
08/05/2003 13:10:11 5
09/05/2003 14:00:36 63928
09/05/2003 16:58:52 2
11/05/2003 23:08:48 2
14/05/2003 23:08:53 6
17/05/2003 23:08:58 5
20/05/2003 23:09:03 5
23/05/2003 23:09:08 5
26/05/2003 23:09:14 5
29/05/2003 23:00:10 3
29/05/2003 23:03:01 10   
01/06/2003 23:03:05 4
04/06/2003 23:03:10 5
07/06/2003 23:03:38 28   
10/06/2003 23:03:50 12   
13/06/2003 23:03:55 6
14/06/2003 07:42:20 3
14/06/2003 14:37:08 3
15/06/2003 20:08:34 3
18/06/2003 20:08:39 6
21/06/2003 20:08:45 6
22/06/2003 03:05:19 138  
22/06/2003 04:06:28 3
25/06/2003 04:06:58 31   
28/06/2003 04:07:02 4
01/07/2003 04:07:06 4
04/07/2003 04:07:11 5
07/07/2003 04:07:16 5
12/07/2003 04:55:20 6
12/07/2003 19:09:51 1158 
12/07/2003 22:14:49 8025 
15/07/2003 22:14:54 6
16/07/2003 05:43:06 18   
19/07/2003 05:43:12 6
22/07/2003 05:43:17 5
23/07/2003 18:18:55 183  
23/07/2003 18:19:55 9
23/07/2003 18:29:15 158  
23/07/2003 19:48:44 4604 
23/07/2003 20:16:27 3
23/07/2003 20:37:29 1079 
23/07/2003 20:43:12 342  
23/07/2003 22:25:51 6158
Fascinating. I suspect the (IDE!) hard drive might be failing as I saw two new files created in /var that I didn't remember seeing before:
-rw-r--r--   1 root    wheel        0 Jan 18 22:55 3@T3
-rw-r--r--   1 root    wheel        0 Jan 18 22:55 DY5
So I shutdown the machine, possibly for the last time:
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...3 3 0 1 1 0 0 done
All buffers synced.
Uptime: 36m43s
usbus0: Controller shutdown
uhub0: at usbus0, port 1, addr 1 (disconnected)
usbus0: Controller shutdown complete
usbus1: Controller shutdown
uhub1: at usbus1, port 1, addr 1 (disconnected)
usbus1: Controller shutdown complete
The operating system has halted.
Please press any key to reboot.
I'll finally note this was the last FreeBSD server I personally operated. I also used FreeBSD to set up the core routers at Koumbit, but those were replaced with Debian recently as well. Thanks Soekris, that was some sturdy hardware. Hopefully this new Protectli router will live up to that "decade plus" challenge. Not sure what the fate of this device will be: I'll bring it to the next Montreal Debian & Stuff to see if anyone's interested; contact me if you can't show up and want this thing.

Matthew Palmer: Why Certificate Lifecycle Automation Matters

If you've perused the ActivityPub feed of certificates whose keys are known to be compromised, and clicked on the "Show More" button to see the name of the certificate issuer, you may have noticed that some issuers seem to come up again and again. This might make sense; after all, if a CA is issuing a large volume of certificates, they'll be seen more often in a list of compromised certificates. In an attempt to see if there is anything that we can learn from this data, though, I did a bit of digging, and came up with some illuminating results.

The Procedure I started off by finding all the unexpired certificates logged in Certificate Transparency (CT) logs that have a key that is in the pwnedkeys database as having been publicly disclosed. From this list of certificates, I removed duplicates by matching up issuer/serial number tuples, and then reduced the set by counting the number of unique certificates by their issuer. This gave me a list of the issuers of these certificates, which looks a bit like this:
/C=BE/O=GlobalSign nv-sa/CN=AlphaSSL CA - SHA256 - G4
/C=GB/ST=Greater Manchester/L=Salford/O=Sectigo Limited/CN=Sectigo RSA Domain Validation Secure Server CA
/C=GB/ST=Greater Manchester/L=Salford/O=Sectigo Limited/CN=Sectigo RSA Organization Validation Secure Server CA
/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certs.godaddy.com/repository//CN=Go Daddy Secure Certificate Authority - G2
/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository//CN=Starfield Secure Certificate Authority - G2
/C=AT/O=ZeroSSL/CN=ZeroSSL RSA Domain Secure Site CA
/C=BE/O=GlobalSign nv-sa/CN=GlobalSign GCC R3 DV TLS CA 2020
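As a rough illustration of the dedupe-and-count step described above, here is a minimal Python sketch (the input format and names are assumptions for illustration, not the actual tooling used for this post):

# Count compromised certificates per issuer, deduplicating by (issuer, serial).
from collections import Counter

def count_by_issuer(certs):
    """certs: iterable of dicts with 'issuer' and 'serial' keys."""
    seen = set()
    counts = Counter()
    for cert in certs:
        key = (cert["issuer"], cert["serial"])
        if key in seen:
            continue            # duplicate (pre)certificate entry, skip it
        seen.add(key)
        counts[cert["issuer"]] += 1
    return counts

# Example usage with two log entries for the same certificate:
certs = [
    {"issuer": "/C=AT/O=ZeroSSL/CN=ZeroSSL RSA Domain Secure Site CA", "serial": "01"},
    {"issuer": "/C=AT/O=ZeroSSL/CN=ZeroSSL RSA Domain Secure Site CA", "serial": "01"},
]
print(count_by_issuer(certs).most_common())   # the duplicate is counted only once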
Rather than try to work with raw issuers (because, as Andrew Ayer says, "The SSL Certificate Issuer Field is a Lie"), I mapped these issuers to the organisations that manage them, and summed the counts for those grouped issuers together.

The Data
[Image: Lieutenant Commander Data from Star Trek: The Next Generation. Insert obligatory "not THAT data" comment here.]
The end result of this work is the following table, sorted by the count of certificates which have been compromised by exposing their private key:
Issuer                   Compromised Count
Sectigo                                170
ISRG (Let's Encrypt)                   161
GoDaddy                                141
DigiCert                                81
GlobalSign                              46
Entrust                                  3
SSL.com                                  1
If you're familiar with the CA ecosystem, you'll probably recognise that the organisations with large numbers of compromised certificates are also those who issue a lot of certificates. So far, nothing particularly surprising, then. Let's look more closely at the relationships, though, to see if we can get more useful insights.

Volume Control Using the issuance volume report from crt.sh, we can compare issuance volumes to compromise counts, to come up with a "compromise rate". I'm using the "Unexpired Precertificates" column from the issuance volume report, as I feel that's the number that best matches the certificate population I'm examining to find compromised certificates. To maintain parity with the previous table, this one is still sorted by the count of certificates that have been compromised.
Issuer                   Issuance Volume   Compromised Count   Compromise Rate
Sectigo                       88,323,068                 170    1 in 519,547
ISRG (Let's Encrypt)         315,476,402                 161    1 in 1,959,480
GoDaddy                       56,121,429                 141    1 in 398,024
DigiCert                     144,713,475                  81    1 in 1,786,586
GlobalSign                     1,438,485                  46    1 in 31,271
Entrust                           23,166                   3    1 in 7,722
SSL.com                          171,816                   1    1 in 171,816
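The "compromise rate" column is simply the issuance volume divided by the compromised count, expressed as "1 in N". A small sketch of that derivation (data hard-coded from the table above; whether the author rounded or truncated is an assumption, though integer division reproduces the published figures):

# Recompute the "1 in N" compromise rates from issuance volumes and compromised counts.
issuers = {
    "Sectigo": (88_323_068, 170),
    "ISRG (Let's Encrypt)": (315_476_402, 161),
    "GoDaddy": (56_121_429, 141),
    "DigiCert": (144_713_475, 81),
    "GlobalSign": (1_438_485, 46),
    "Entrust": (23_166, 3),
    "SSL.com": (171_816, 1),
}

for issuer, (volume, compromised) in issuers.items():
    # Integer division reproduces the figures in the table above.
    print(f"{issuer:22s} 1 in {volume // compromised:,}")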
If we now sort this table by compromise rate, we can see which organisations have the most (and least) leakiness going on from their customers:
Issuer                   Issuance Volume   Compromised Count   Compromise Rate
Entrust                           23,166                   3    1 in 7,722
GlobalSign                     1,438,485                  46    1 in 31,271
SSL.com                          171,816                   1    1 in 171,816
GoDaddy                       56,121,429                 141    1 in 398,024
Sectigo                       88,323,068                 170    1 in 519,547
DigiCert                     144,713,475                  81    1 in 1,786,586
ISRG (Let's Encrypt)         315,476,402                 161    1 in 1,959,480
By grouping by order-of-magnitude in the compromise rate, we can identify three "bands":
  • The Super Leakers: Customers of Entrust and GlobalSign seem to love to lose control of their private keys. For Entrust, at least, though, the small volumes involved make the numbers somewhat untrustworthy. The three compromised certificates could very well belong to just one customer, for instance. I'm not aware of anything that GlobalSign does that would make them such an outlier, either, so I'm inclined to think they just got unlucky with one or two customers, but as CAs don't include customer IDs in the certificates they issue, it's not possible to say whether that's the actual cause or not.
  • The Regular Leakers: Customers of SSL.com, GoDaddy, and Sectigo all have compromise rates in the 1-in-hundreds-of-thousands range. Again, the low volumes of SSL.com make the numbers somewhat unreliable, but the other two organisations in this group have large enough numbers that we can rely on that data fairly well, I think.
  • The Low Leakers: Customers of DigiCert and Let's Encrypt are at least three times less likely than customers of the regular leakers to lose control of their private keys. Good for them!
Now we have some useful insights we can think about.

Why Is It So?
[Image: Professor Julius Sumner Miller. If you don't know who Professor Julius Sumner Miller is, I highly recommend finding out.]
All of the organisations on the list, with the exception of Let's Encrypt, are what one might term "traditional" CAs. To a first approximation, it's reasonable to assume that the vast majority of the customers of these traditional CAs probably manage their certificates the same way they have for the past two decades or more. That is, they generate a key and CSR, upload the CSR to the CA to get a certificate, then copy the cert and key somewhere. Since humans are handling the keys, there's a higher risk of the humans using either risky practices, or making a mistake, and exposing the private key to the world. Let's Encrypt, on the other hand, issues all of its certificates using the ACME (Automatic Certificate Management Environment) protocol, and all of the Let's Encrypt documentation encourages the use of software tools to generate keys, issue certificates, and install them for use. Given that Let's Encrypt has 161 compromised certificates currently in the wild, it's clear that the automation in use is far from perfect, but the significantly lower compromise rate suggests to me that lifecycle automation at least reduces the rate of key compromise, even though it doesn't eliminate it completely.

Explaining the Outlier The difference in presumed issuance practices would seem to explain the significant difference in compromise rates between Let's Encrypt and the other organisations, if it weren't for one outlier. This is a largely traditional CA, with the manual-handling issues that implies, but with a compromise rate close to that of Let's Encrypt. We are, of course, talking about DigiCert. The thing about DigiCert, that doesn't show up in the raw numbers from crt.sh, is that DigiCert manages the issuance of certificates for several of the biggest hosted TLS providers, such as CloudFlare and AWS. When these services obtain a certificate from DigiCert on their customer's behalf, the private key is kept locked away, and no human can (we hope) get access to the private key. This is supported by the fact that no certificates identifiably issued to either CloudFlare or AWS appear in the set of certificates with compromised keys. When we ask for all certificates issued by "DigiCert", we get both the certificates issued to these big providers, which are very good at keeping their keys under control, as well as the certificates issued to everyone else, whose key handling practices may not be quite so stringent. It's possible, though not trivial, to account for certificates issued to these hosted TLS providers, because the certificates they use are issued from intermediates branded to those companies. With the crt.sh psql interface we can run this query to get the total number of unexpired precertificates issued to these managed services:
-- Sum unexpired precertificates issued to the managed services
-- (Amazon/CloudFlare-branded intermediates owned by DigiCert, per the CCADB owner fields).
SELECT SUM(sub.NUM_ISSUED[2] - sub.NUM_EXPIRED[2])
  FROM (
    SELECT ca.name, max(coalesce(coalesce(nullif(trim(cc.SUBORDINATE_CA_OWNER), ''), nullif(trim(cc.CA_OWNER), '')), cc.INCLUDED_CERTIFICATE_OWNER)) as OWNER,
           ca.NUM_ISSUED, ca.NUM_EXPIRED
      FROM ccadb_certificate cc, ca_certificate cac, ca
     WHERE cc.CERTIFICATE_ID = cac.CERTIFICATE_ID
       AND cac.CA_ID = ca.ID
  GROUP BY ca.ID
  ) sub
 WHERE sub.name ILIKE '%Amazon%' OR sub.name ILIKE '%CloudFlare%' AND sub.owner = 'DigiCert';
The number I get from running that query is 104,316,112, which should be subtracted from DigiCert's total issuance figures to get a more accurate view of what DigiCert's regular customers do with their private keys. When I do this, the compromise rates table, sorted by the compromise rate, looks like this:
Issuer                   Issuance Volume   Compromised Count   Compromise Rate
Entrust                           23,166                   3    1 in 7,722
GlobalSign                     1,438,485                  46    1 in 31,271
SSL.com                          171,816                   1    1 in 171,816
GoDaddy                       56,121,429                 141    1 in 398,024
"Regular" DigiCert            40,397,363                  81    1 in 498,732
Sectigo                       88,323,068                 170    1 in 519,547
All DigiCert                 144,713,475                  81    1 in 1,786,586
ISRG (Let's Encrypt)         315,476,402                 161    1 in 1,959,480
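The "Regular" DigiCert row is just the subtraction described above, followed by the same rate calculation; a quick sketch with the figures from the post:

# Remove the managed-service certificates (Amazon/CloudFlare intermediates) from
# DigiCert's issuance volume, then recompute the compromise rate for the remainder.
all_digicert = 144_713_475
managed_services = 104_316_112   # result of the crt.sh query above
compromised = 81

regular_digicert = all_digicert - managed_services        # 40,397,363
print(f'"Regular" DigiCert: 1 in {regular_digicert // compromised:,}')   # 1 in 498,732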
In short, it appears that DigiCert's regular customers are just as likely as GoDaddy or Sectigo customers to expose their private keys.

What Does It All Mean? The takeaway from all this is fairly straightforward, and not overly surprising, I believe.

The less humans have to do with certificate issuance, the less likely they are to compromise that certificate by exposing the private key. While it may not be surprising, it is nice to have some empirical evidence to back up the common wisdom. Fully-managed TLS providers, such as CloudFlare, AWS Certificate Manager, and whatever Azure's thing is called, are the platonic ideal of this principle: never give humans any opportunity to expose a private key. I'm not saying you should use one of these providers, but the security approach they have adopted appears to be the optimal one, and should be emulated universally. The ACME protocol is the next best, in that there are a variety of standardised tools widely available that allow humans to take themselves out of the loop, but it's still possible for humans to handle (and mistakenly expose) key material if they try hard enough. Legacy issuance methods, which either cannot be automated, or require custom, per-provider automation to be developed, appear to be at least four times less helpful to the goal of avoiding compromise of the private key associated with a certificate.

Humans Are, Of Course, The Problem
[Image: Bender, the robot from Futurama, asking if we'd like to kill all humans. No thanks, Bender, I'm busy tonight.]
This observation (that if you don't let humans near keys, they don't get leaked) is further supported by considering the biggest issuers by volume who have not issued any certificates whose keys have been compromised: Google Trust Services (fourth largest issuer overall, with 57,084,529 unexpired precertificates), and Microsoft Corporation (sixth largest issuer overall, with 22,852,468 unexpired precertificates). It appears that somewhere between most and basically all of the certificates these organisations issue are to customers of their public clouds, and my understanding is that the keys for these certificates are managed in the same manner as CloudFlare and AWS: the keys are locked away where humans can't get to them. It should, of course, go without saying that if a human can never have access to a private key, it makes it rather difficult for a human to expose it. More broadly, if you are building something that handles sensitive or secret data, the more you can do to keep humans out of the loop, the better everything will be.

Your Support is Appreciated If you'd like to see more analysis of how key compromise happens, and the lessons we can learn from examining billions of certificates, please show your support by buying me a refreshing beverage. Trawling CT logs is thirsty work.

Appendix: Methodology Limitations In the interests of clarity, I feel it's important to describe ways in which my research might be flawed. Here are the things I know of that may have impacted the accuracy, and that I couldn't feasibly account for.
  • Time Periods: Because time never stops, there are likely to be some slight mismatches in the numbers obtained from the various data sources, because they weren't collected at exactly the same moment.
  • Issuer-to-Organisation Mapping: It's possible that the way I mapped issuers to organisations doesn't match exactly with how crt.sh does it, meaning that counts might be skewed. I tried to minimise that by using the same data sources (the CCADB AllCertificates report) that I believe crt.sh uses for its mapping, but I cannot be certain of a perfect match.
  • Unwarranted Grouping: I've drawn some conclusions about the practices of the various organisations based on their general approach to certificate issuance. If a particular subordinate CA that I've grouped into the parent organisation is managed in some unusual way, that might cause my conclusions to be erroneous. I was able to fairly easily separate out CloudFlare, AWS, and Azure, but there are almost certainly others that I didn't spot, because hoo boy there are a lot of intermediate CAs out there.

12 January 2024

Freexian Collaborators: Monthly report about Debian Long Term Support, December 2023 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors In December, 18 contributors were paid to work on Debian LTS; their reports are available:
  • Abhijith PA did 7.0h (out of 7.0h assigned and 7.0h from previous period), thus carrying over 7.0h to the next month.
  • Adrian Bunk did 16.0h (out of 26.25h assigned and 8.75h from previous period), thus carrying over 19.0h to the next month.
  • Bastien Roucariès did 16.0h (out of 16.0h assigned and 4.0h from previous period), thus carrying over 4.0h to the next month.
  • Ben Hutchings did 8.0h (out of 7.25h assigned and 16.75h from previous period), thus carrying over 16.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Emilio Pozuelo Monfort did 8.0h (out of 26.75h assigned and 8.25h from previous period), thus carrying over 27.0h to the next month.
  • Guilhem Moulin did 25.0h (out of 18.0h assigned and 7.0h from previous period).
  • Holger Levsen did 5.5h (out of 5.5h assigned).
  • Jochen Sprickerhof did 0.0h (out of 0h assigned and 10.0h from previous period), thus carrying over 10.0h to the next month.
  • Lee Garrett did 0.0h (out of 25.75h assigned and 9.25h from previous period), thus carrying over 35.0h to the next month.
  • Markus Koschany did 35.0h (out of 35.0h assigned).
  • Roberto C. Sánchez did 9.5h (out of 5.5h assigned and 6.5h from previous period), thus carrying over 2.5h to the next month.
  • Santiago Ruano Rincón did 8.255h (out of 3.26h assigned and 12.745h from previous period), thus carrying over 7.75h to the next month.
  • Sean Whitton did 4.25h (out of 3.25h assigned and 6.75h from previous period), thus carrying over 5.75h to the next month.
  • Sylvain Beucler did 16.5h (out of 21.25h assigned and 13.75h from previous period), thus carrying over 18.5h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 10.25h (out of 12.0h assigned), thus carrying over 1.75h to the next month.
  • Utkarsh Gupta did 18.75h (out of 11.25h assigned and 13.5h from previous period), thus carrying over 6.0h to the next month.

Evolution of the situation In December, we released 29 DLAs. A particularly notable update in December was prepared by LTS contributor Santiago Ruano Rincón for the openssh package. The update produced DLA-3694-1 and included a fix for the Terrapin Attack (CVE-2023-48795), which was a rather serious flaw in the SSH protocol itself. The package bluez was the subject of another notable update by LTS contributor Chris Lamb, which resulted in DLA-3689-1 to address an insecure default configuration which allowed attackers to inject keyboard commands over Bluetooth without first authenticating. The LTS team continues its efforts to have a positive impact beyond the boundaries of LTS. Several contributors worked on packages, preparing LTS updates, but also preparing patches or full updates which were uploaded to the unstable, stable, and oldstable distributions, including: Guilhem Moulin's update of tinyxml (uploads to LTS and unstable and patches submitted to the security team for stable and oldstable); Guilhem Moulin's update of xerces-c (uploads to LTS and unstable and patches submitted to the security team for oldstable); Thorsten Alteholz's update of libde265 (uploads to LTS and stable and additional patches submitted to the maintainer for stable and oldstable); Thorsten Alteholz's update of cjson (upload to LTS and patches submitted to the maintainer for stable and oldstable); and Tobias Frost's update of opendkim (sponsoring a maintainer-prepared upload to LTS and additionally preparing updates for stable and oldstable). Going beyond Debian and looking to the broader community, LTS contributor Bastien Roucariès was contacted by SUSE concerning an update he had prepared for zbar. He was able to assist by coordinating with the former organization of the original zbar author to secure for SUSE access to information concerning the exploits. This has enabled another distribution to benefit from the work done in support of LTS and from the assistance of Bastien in coordinating the access to information. Finally, LTS contributor Santiago Ruano Rincón continued work relating to how updates for packages in statically-linked language ecosystems (e.g., Go, Rust, and others) are handled. The work is presently focused on more accurately and reliably identifying which packages are impacted in a given update scenario to enable notifications to be published so that users will be made aware of these situations as they occur. As the work continues, it will eventually result in improvements to Debian infrastructure so that the LTS team and Security team are able to manage updates of this nature in a more consistent way.

Thanks to our sponsors Sponsors that joined recently are in bold.

11 January 2024

Matthias Klumpp: Wayland really breaks things - just for now?

This post is in part a response to an aspect of Nate's post "Does Wayland really break everything?", but also my reflection on discussing Wayland protocol additions, a unique pleasure that I have been involved with for the past months [1].

Some facts Before I start I want to make a few things clear: The Linux desktop will be moving to Wayland [2] - this is a fact at this point (and has been for a while), sticking to X11 makes no sense for future projects. From reading Wayland protocols and working with it at a much lower level than I ever wanted to, it is also very clear to me that Wayland is an exceptionally well-designed core protocol, and so are the additional extension protocols (xdg-shell & Co.). The modularity of Wayland is great, it gives it incredible flexibility and will for sure turn out to be good for the long-term viability of this project (and also provides a path to correct protocol issues in the future, if one is found). In other words: Wayland is an amazing foundation to build on, and a lot of its design decisions make a lot of sense! The shift towards people seeing Linux more as an application developer platform, and taking PipeWire and XDG Portals into account when designing for Wayland, is also an amazing development and I love to see this; this holistic approach is something I always wanted! Furthermore, I think Wayland removes a lot of functionality that shouldn't exist in a modern compositor, and that's a good thing too! Some of X11's features and design decisions had clear drawbacks that we shouldn't replicate. I highly recommend reading Nate's blog post, it's very good and goes into more detail. And due to all of this, I firmly believe that any advancement in the Wayland space must come from within the project.

But! But of course there was a but coming... I think while developing Wayland-as-an-ecosystem we are now entrenched in narrow concepts of how a desktop should work. While discussing Wayland protocol additions, a lot of concepts clash: people from different desktops with different design philosophies debate the merits of those over and over again, never reaching any conclusion (just as you will never get an answer out of humans whether sushi or pizza is the clearly superior food, or whether CSD or SSD is better). Some people want to use Wayland as a vehicle to force applications to submit to their desktop's design philosophies, others prefer the smallest and leanest protocol possible, other developers want the most elegant behavior possible. To be clear, I think those are all very valid approaches. But this also creates problems: By switching to Wayland compositors, we are already forcing a lot of porting work onto toolkit developers and application developers. This is annoying, but just work that has to be done. It becomes frustrating though if Wayland provides toolkits with absolutely no way to reach their goal in any reasonable way. For Nate's Photoshop analogy: Of course Linux does not break Photoshop, it is Adobe's responsibility to port it. But what if Linux was missing a crucial syscall that Photoshop needed for proper functionality and Adobe couldn't port it without that? In that case it becomes much less clear who is to blame for Photoshop not being available. A lot of Wayland protocol work is focused on the environment and design, while applications and the work to port them are often considered less. I think this happens because the overlap between application developers and developers of the desktop environments is not necessarily large, and the overlap with people willing to engage with Wayland upstream is even smaller. The combination of Windows developers porting apps to Linux and having involvement with toolkits or Wayland is pretty much nonexistent. So they have less of a voice.

A quick detour through the neuroscience research lab I have been involved with Freedesktop, GNOME and KDE for an incredibly long time now (more than a decade), but my actual job (besides consulting for Purism) is that of a PhD candidate in a neuroscience research lab (working on the morphology of biological neurons and its relation to behavior). I am mostly involved with three research groups in our institute, which is about 35 people. Most of us do all our data analysis on powerful servers which we connect to using RDP (with KDE Plasma as desktop). Since I joined, I have been pushing the envelope a bit to extend Linux usage to data acquisition and regular clients, and to have our data acquisition hardware interface well with it. Linux brings some unique advantages for use in research, besides the obvious one of having every step of your data management platform introspectable with no black boxes left, a goal I value very highly in research (but this would be its own blogpost). In terms of operating system usage though, most systems are still Windows-based. Windows is what companies develop for, and what people use by default and are familiar with. The choice of operating system is very strongly driven by application availability, and WSL being really good makes this somewhat worse, as it removes the need for people to switch to a real Linux system entirely if there is the occasional software requiring it. Yet, we have a lot more Linux users than before, and use it in many places where it makes sense. I also developed novel data acquisition software that runs on Linux only and uses the abilities of the platform to its fullest extent. All of this resulted in me asking existing software and hardware vendors for Linux support a lot more often. The vendor-customer relationship in science is usually pretty good, and vendors do usually want to help out. Same for open source projects, especially if you offer to do Linux porting work for them. But overall, the ease of use and availability of required applications and their usability rules supreme. Most people are not technically knowledgeable and just want to get their research done in the best way possible, getting the best results with the least amount of friction.
KDE/Linux usage at a control station for a particle accelerator at Adlershof Technology Park, Germany, for reference (by 25 years of KDE) [3]

Back to the point The point of that story is this: GNOME, KDE, RHEL, Debian or Ubuntu: They all do not matter if the necessary applications are not available for them. And as soon as they are, the easiest-to-use solution wins. There are many facets of "easiest": In many cases this is RHEL due to Red Hat support contracts being available, in many other cases it is Ubuntu due to its mindshare and ease of use. KDE Plasma is also frequently seen, as it is perceived a bit easier to onboard Windows users with it (among other benefits). Ultimately, it comes down to applications and 3rd-party support though. Here's a dirty secret: In many cases, porting an application to Linux is not that difficult. The thing that companies (and FLOSS projects too!) struggle with, and will carefully weigh in advance, is whether it is worth the support cost as well as continuous QA/testing. Their staff will have to do all of that work, and they could spend that time on other tasks after all. So if they learn that porting to Linux not only means added testing and support, but also means to choose between the legacy X11 display server that allows for 1:1 porting from Windows or the new Wayland compositors that do not support the same features they need, they will quickly consider it not worth the effort at all. I have seen this happen. Of course many apps use a cross-platform toolkit like Qt, which greatly simplifies porting. But this just moves the issue one layer down, as now the toolkit needs to abstract Windows, macOS and Wayland. And Wayland does not contain features to do certain things or does them very differently from e.g. Windows, so toolkits have no way to actually implement the existing functionality in a way that works on all platforms. So in Qt's documentation you will often find texts like "works everywhere except for on Wayland compositors or mobile" [4]. Many missing bits or altered behavior are just papercuts, but those add up. And if users will have a worse experience, this will translate to more support work, or people not wanting to use the software on the respective platform.

What's missing?

Window positioning SDI applications with multiple windows are very popular in the scientific world. For data acquisition (for example with microscopes) we often have one monitor with control elements and one larger one with the recorded image. There are also other configurations where multiple signal modalities are acquired, and the experimenter aligns windows exactly in the way they want and expects the layout to be stored and to be loaded upon reopening the application. Even in the image from Adlershof Technology Park above you can see this style of UI design, at mega-scale. Being able to pop out elements as windows from a single-window application to move them around freely is another frequently used paradigm, and immensely useful with these complex apps. It is important to note that this is not a legacy design, but in many cases an intentional choice: these kinds of apps work incredibly well on larger screens or many screens and are very flexible (you can have any window configuration you want, and switch between them using the (usually) great window management abilities of your desktop). Of course, these apps will work terribly on tablets and small form factors, but that is not the purpose they were designed for and nobody would use them that way. I assumed for sure these features would be implemented at some point, but when it became clear that that would not happen, I created the ext-placement protocol which had some good discussion but was ultimately rejected from the xdg namespace. I then tried another solution based on feedback, which turned out not to work for most apps, and now proposed xdg-placement (v2) in an attempt to maybe still get some protocol done that we can agree on, exploring more options before pushing the existing protocol for inclusion into the ext Wayland protocol namespace. Meanwhile though, we can not port any application that needs this feature, while at the same time we are switching desktops and distributions to Wayland by default.

Window position restoration Similarly, a protocol to save & restore window positions was already proposed in 2018, 6 years ago now, but it has still not been agreed upon, and may not even help multiwindow apps in its current form. The absence of this protocol means that applications can not restore their former window positions, and the user has to move them to their previous place again and again. Meanwhile, toolkits can not adopt these protocols and applications can not use them and can not be ported to Wayland without introducing papercuts.

Window icons Similarly, individual windows can not set their own icons, and not-installed applications can not have an icon at all because there is no desktop-entry file to load the icon from and no icon in the theme for them. You would think this is a niche issue, but for applications that create many windows, providing icons for them so the user can find them is fairly important. Of course it's not the end of the world if every window has the same icon, but it's one of those papercuts that make the software slightly less user-friendly. Even applications with fewer windows like LibrePCB are affected, so much so that they'd rather run their app through Xwayland for now. I decided to address this after I was working on data analysis of image data in a Python virtualenv, where my code and the Python libraries used created lots of windows all with the default yellow W icon, making it impossible to distinguish them at a glance. This is xdg-toplevel-icon now, but of course it is an uphill battle where the very premise of needing this is questioned. So applications can not use it yet.

Limited window abilities requiring specialized protocols Firefox has a picture-in-picture feature, allowing it to pop out media from a media player as a separate floating window so the user can watch the media while doing other things. On X11 this is easily realized, but on Wayland the restrictions posed on windows necessitate a different solution. The xdg-pip protocol was proposed for this specialized usecase, but it is also not merged yet. So this feature does not work as well on Wayland.

Automated GUI testing / accessibility / automation Automation of GUI tasks is a powerful feature, as is the ability to auto-test GUIs. This is being worked on, with libei and wlheadless-run (and stuff like ydotool exists too), but we're not fully there yet.

Wayland is frustrating for (some) application authors As you see, there are valid applications and valid use cases that can not yet be ported to Wayland with the same feature range they enjoyed on X11, Windows or macOS. So, from an application author's perspective, Wayland does break things quite significantly, because things that worked before can no longer work and Wayland (the whole stack) does not provide any avenue to achieve the same result. Wayland does break screen sharing, global hotkeys, gaming latency (via "no tearing"), etc., however for all of these there are solutions available that application authors can port to. And most developers will gladly do that work, especially since the newer APIs are usually a lot better and more robust. But if you give application authors no path forward except "use Xwayland and be on emulation as a second-class citizen forever", it just results in very frustrated application developers. For some application developers, switching to a Wayland compositor is like buying a canvas from the Linux shop that forces your brush to only draw triangles. But maybe for your avant-garde art, you need to draw a circle. You can approximate one with triangles, but it will never be as good as the artwork of your friends who got their canvases from the Windows or macOS art supply shop and have more freedom to create their art.

Triangles are proven to be the best shape! If you are drawing circles you are creating bad art! Wayland, via its protocol limitations, forces a certain way to build application UX, often for the better, but also sometimes to the detriment of users and applications. The protocols are often fairly opinionated, a result of the lessons learned from X11. In any case though, it is the odd one out: Windows and macOS do not pose the same limitations (for better or worse!), and the effort to port to Wayland is orders of magnitude bigger, or sometimes, in the case of the multiwindow UI paradigm, impossible to achieve to the same level of polish. Desktop environments of course have a design philosophy that they want to push, and want applications to integrate as much as possible (same as macOS and Windows!). However, there are many applications out there, and pushing a design via protocol limitations will likely just result in fewer apps.

The porting dilemma I spent probably way too much time looking into how to get applications cross-platform and running on Linux, often talking to vendors (FLOSS and proprietary) as well. Wayland limitations aren't the biggest issue by far, but they do start to come up now, especially in the scientific space with Ubuntu having switched to Wayland by default. For application authors there is often no way to address these issues. Many scientists do not even understand why their Python script that creates some GUIs suddenly behaves weirdly because Qt is now using the Wayland backend on Ubuntu instead of X11. They do not know the difference and also do not want to deal with these details, even though they may be programmers as well; the real goal is not to fiddle with the display server, but to get to a scientific result somehow. Another issue is portability layers like Wine which need to run Windows applications as-is on Wayland. Apparently Wine's Wayland driver has some heuristics to make window positioning work (and I am amazed by the work done on this!), but that can only go so far.

A way out? So, how would we actually solve this? Fundamentally, this excessively long blog post boils down to just one essential question: Do we want to force applications to submit to a UX paradigm unconditionally, potentially losing out on application ports or keeping apps on X11 eternally, or do we want to throw them some rope to get as many applications ported over to Wayland, even though we might sacrifice some protocol purity? I think we really have to answer that to make the discussions on wayland-protocols a lot less grueling. This question can be answered at the wayland-protocols level, but even more so it must be answered by the individual desktops and compositors. If the answer for your environment turns out to be "Yes, we want the Wayland protocol to be more opinionated and will not make any compromises for application portability", then your desktop/compositor should just immediately NACK protocols that add something like this and you simply shouldn't engage in the discussion, as you reject the very premise of the new protocol: That it has any merit to exist and is needed in the first place. In this case contributors to Wayland and application authors also know where you stand, and a lot of debate is skipped. Of course, if application authors want to support your environment, you are basically asking them now to rewrite their UI, which they may or may not do. But at least they know what to expect and how to target your environment. If the answer turns out to be "We do want some portability", the next question obviously becomes where the line should be drawn and which changes are acceptable and which aren't. We can't blindly copy all X11 behavior, some porting work to Wayland is simply inevitable. Some written rules for that might be nice, but probably more importantly, if you agree fundamentally that there is an issue to be fixed, please engage in the discussions for the respective MRs! We for sure do not want to repeat X11 mistakes, and I am certain that we can implement protocols which provide the required functionality in a way that is a nice compromise in allowing applications a path forward into the Wayland future, while also being as good as possible and improving upon X11. For example, the toplevel-icon proposal is already a lot better than anything X11 ever had. Relaxing ACK requirements for the ext namespace is also a good proposed administrative change, as it allows some compositors to add features they want to support to the shared repository easier, while also not mandating them for others. In my opinion, it would allow for a lot less friction between the two different ideas of how Wayland protocol development should work. Some compositors could move forward and support more protocol extensions, while more restrictive compositors could support fewer things. Applications can detect supported protocols at launch and change their behavior accordingly (ideally even abstracted by toolkits). You may now say that a lot of apps are ported, so surely this issue can not be that bad. And yes, what Wayland provides today may be enough for 80-90% of all apps. But what I hope the detour into the research lab has done is convince you that this smaller percentage of apps matters. A lot. And that it may be worthwhile to support them. To end on a positive note: When it came to porting concrete apps over to Wayland, the only real showstoppers so far [5] were the missing window-positioning and window-position-restore features. 
I encountered them when porting my own software, and I got the issue as feedback from colleagues and fellow engineers. In second place was UI testing and automation support; the window-icon issue was mentioned twice, but being a cosmetic issue it likely simply hurts people less and they can ignore it more easily. What this means is that the majority of apps are already fine, and many others are very, very close! A Wayland future for everyone is within our grasp!  I will also bring my two protocol MRs to their conclusion for sure, because as application developers we need clarity on what the platform (either all desktops or even just a few) supports and will or will not support in the future. And the only way to get something good done is by contribution and friendly discussion.

Footnotes
  1. Apologies for the clickbait-y title; it comes with the subject
  2. When I talk about Wayland I mean the combined set of display server protocols and accepted protocol extensions, unless otherwise clarified.
  3. I would have picked a picture from our lab, but that would have needed permission first
  4. Qt has awesome platform issues pages, like for macOS and Linux/X11 which help with porting efforts, but Qt doesn't even list Linux/Wayland as a supported platform. There is some information though, like window geometry peculiarities, which aren't particularly helpful when porting (but still essential to know).
  5. Besides issues with Nvidia hardware: CUDA for simulations and machine learning is pretty much everywhere, so Nvidia cards are common, which still causes trouble on Wayland. It is improving though.

8 January 2024

Russ Allbery: Review: The Faithless

Review: The Faithless, by C.L. Clark
Series: Magic of the Lost #2
Publisher: Orbit
Copyright: March 2023
ISBN: 0-316-54283-0
Format: Kindle
Pages: 527
The Faithless is the second book in a political fantasy series that seems likely to be a trilogy. It is a direct sequel to The Unbroken, which you should read first. As usual, Orbit made it unnecessarily hard to get re-immersed in the world by refusing to provide memory aids for readers who read books as they come out instead of only when the series is complete, but this is not the fault of Clark or the book and you've heard me rant about this before. The Unbroken was set in Qazāl (not-Algeria). The Faithless, as readers of the first book might guess from the title, is set in Balladaire (not-France). This is the palace intrigue book. Princess Luca is fighting for her throne against her uncle, the regent. Touraine is trying to represent her people. Whether and to what extent those interests are aligned is much of the meat of this book. Normally I enjoy palace intrigue novels for the competence porn: watching someone navigate a complex political situation with skill and cunning, or upend the entire system by building unlikely coalitions or using unexpected routes to power. If you are similar, be warned that this is not what you're going to get. Touraine is a fish out of water with no idea how to navigate the Balladairan court, and does not magically become an expert in the course of this novel. Luca has the knowledge, but she's unsure, conflicted, and largely out-maneuvered. That means you will have to brace for some painful scenes of some of the worst people apparently getting what they want. Despite that, I could not put this down. It was infuriating, frustrating, and a much slower burn than I prefer, but the layers of complex motivations that Clark builds up provided a different sort of payoff. Two books in, the shape of this series is becoming clearer. This series is about empire and colonialism, but with considerably more complexity than fantasy normally brings to that topic. Power does not loosen its grasp easily, and it has numerous tools for subtle punishment after apparent upstart victories. Righteous causes rarely call banners to your side; instead, they create opportunities for other people to maneuver to their own advantage. Touraine has some amount of power now, but it's far from obvious how to use it. Her life's training tells her that exercising power will only cause trouble, and her enemies are more than happy to reinforce that message at every opportunity. Most notable to me is Clark's bitingly honest portrayal of the supposed allies within the colonial power. It is clear that Luca is attempting to take the most ethical actions as she defines them, but it's remarkable how those efforts inevitably imply that Touraine should help Luca now in exchange for Luca's tenuous and less-defined possible future aid. This is not even a lie; it may be an accurate summary of Balladairan politics. And yet, somehow what Balladaire needs always matters more than the needs of their abused colony. Underscoring this, Clark introduces another faction in the form of a populist movement against the Balladairan monarchy. The details of that setup in another fantasy novel would make them allies of the Qazāl. Here, as is so often the case in real life, a substantial portion of the populists are even more xenophobic and racist than the nobility. There are no easy alliances. The trump card that Qazāl holds is magic. They have it, and (for reasons explored in The Unbroken) Balladaire needs it, although that is a position held by Luca's faction and not by her uncle. 
But even Luca wants to reduce that magic to a manageable technology, like any other element of the Balladairan state. She wants to understand it, harness it, and bring it under local control. Touraine, trained by Balladaire and facing Balladairan political problems, has the same tendency. The magic, at least in this book, refuses not in the flashy, rebellious way that it would in most fantasy, but in a frustrating and incomprehensible lack of predictable or convenient rules. I think this will feel like a plot device to some readers, and that is to some extent true, but I think I see glimmers of Clark setting up a conflict of world views that will play out in the third book. I think some people are going to bounce off this book. It's frustrating, enraging, at times melodramatic, and does not offer the cathartic payoff typically offered in fantasy novels of this type. Usually these are things I would be complaining about as well. And yet, I found it satisfyingly challenging, engrossing, and memorable. I spent a lot of the book yelling "just kill him already" at the characters, but I think one of Clark's points is that overcoming colonial relationships requires a lot more than just killing one evil man. The characters profoundly fail to execute some clever and victorious strategy. Instead, as in the first book, they muddle through, making the best choice that they can see in each moment, making lots of mistakes, and paying heavy prices. It's realistic in a way that has nothing to do with blood or violence or grittiness. (Although I did appreciate having the thin thread of Pruett's story and its highly satisfying conclusion.) This is also a slow-burn romance, and there too I think opinions will differ. Touraine and Luca keep circling back to the same arguments and the same frustrations, and there were times that this felt repetitive. It also adds a lot of personal drama to the politics in a way that occasionally made me dubious. But here too, I think Clark is partly using the romance to illustrate the deeper political points. Luca is often insufferable, cruel and ambitious in ways she doesn't realize, and only vaguely able to understand the Qazāl perspective; in short, she's the pragmatic centrist reformer. I am dubious that her ethics would lead her to anything other than endless compromise without Touraine to push her. To Luca's credit, she also realizes that and wants to be a better person, but struggles to have the courage to act on it. Touraine both does and does not want to manipulate her; she wants Luca's help (and more), but it's not clear Luca will give it under acceptable terms, or even understand how much she's demanding. It's that foundational conflict that turns the romance into a slow burn by pushing them apart. Apparently I have more patience for this type of on-again, off-again relationship than one based on artificial miscommunication. The more I noticed the political subtext, the more engaging I found the romance on the surface. I picked this up because I'd read several books about black characters written by white authors, and while there was nothing that wrong with those books, the politics felt a little too reductionist and simplified. I wanted a book that was going to force me out of comfortable political assumptions. The Faithless did exactly what I was looking for, and I am definitely here for the rest of the series. In that sense, recommended, although do not go into this book hoping for adroit court maneuvering and competence porn. 
Followed by The Sovereign, which does not yet have a release date. Content warnings: Child death, attempted cultural genocide. Rating: 7 out of 10

27 December 2023

Russ Allbery: Review: A Study in Scarlet

Review: A Study in Scarlet, by Arthur Conan Doyle
Series: Sherlock Holmes #1
Publisher: AmazonClassics
Copyright: 1887
Printing: February 2018
ISBN: 1-5039-5525-7
Format: Kindle
Pages: 159
A Study in Scarlet is the short mystery novel (probably a novella, although I didn't count words) that introduced the world to Sherlock Holmes. I'm going to invoke the 100-year-rule and discuss the plot of this book rather freely on the grounds that even someone who (like me prior to a few days ago) has not yet read it is probably not that invested in avoiding all spoilers. If you do want to remain entirely unspoiled, exercise caution before reading on. I had somehow managed to avoid ever reading anything by Arthur Conan Doyle, not even a short story. I therefore couldn't be sure that some of the assertions I was making in my review of A Study in Honor were correct. Since A Study in Scarlet would be quick to read, I decided on a whim to do a bit of research and grab a free copy of the first Holmes novel. Holmes is such a part of English-speaking culture that I thought I had a pretty good idea of what to expect. This was largely true, but cultural osmosis had somehow not prepared me for the surprise Mormons. A Study in Scarlet establishes the basic parameters of a Holmes story: Dr. James Watson as narrator, the apartment he shares with Holmes at 221B Baker Street, the Baker Street Irregulars, Holmes's competition with police detectives, and his penchant for making leaps of logical deduction from subtle clues. The story opens with Watson meeting Holmes, agreeing to split the rent of a flat, and being baffled by the apparent randomness of Holmes's fields of study before Holmes reveals he's a consulting detective. The first case is a murder: a man is found dead in an abandoned house, without a mark on him although there are blood splatters on the walls and the word "RACHE" written in blood. Since my only prior exposure to Holmes was from cultural references and a few TV adaptations, there were a few things that surprised me. One is that Holmes is voluble and animated rather than aloof. Doyle is clearly going for passionate eccentric rather than calculating mastermind. Another is that he is intentionally and unabashedly ignorant on any topic not related to solving mysteries.
My surprise reached a climax, however, when I found incidentally that he was ignorant of the Copernican Theory and of the composition of the Solar System. That any civilized human being in this nineteenth century should not be aware that the earth travelled round the sun appeared to be to me such an extraordinary fact that I could hardly realize it. "You appear to be astonished," he said, smiling at my expression of surprise. "Now that I do know it I shall do my best to forget it." "To forget it!" "You see," he explained, "I consider that a man's brain originally is like a little empty attic, and you have to stock it with such furniture as you chose. A fool takes in all the lumber of every sort that he comes across, so that the knowledge which might be useful to him gets crowded out, or at best is jumbled up with a lot of other things so that he has a difficulty in laying his hands upon it. Now the skilful workman is very careful indeed as to what he takes into his brain-attic. He will have nothing but the tools which may help him in doing his work, but of these he has a large assortment, and all in the most perfect order. It is a mistake to think that that little room has elastic walls and can distend to any extent. Depend upon it there comes a time when for every addition of knowledge you forget something that you knew before. It is of the highest importance, therefore, not to have useless facts elbowing out the useful ones."
This is directly contrary to my expectation that the best way to make leaps of deduction is to know something about a huge range of topics so that one can draw unexpected connections, particularly given the puzzle-box construction and odd details so beloved in classic mysteries. I'm now curious if Doyle stuck with this conception, and if there were any later mysteries that involved astronomy. Speaking of classic mysteries, A Study in Scarlet isn't quite one, although one can see the shape of the genre to come. Doyle does not "play fair" by the rules that have not yet been invented. Holmes at most points knows considerably more than the reader, including bits of evidence that are not described until Holmes describes them and research that Holmes does off-camera and only reveals when he wants to be dramatic. This is not the sort of story where the reader is encouraged to try to figure out the mystery before the detective. Rather, what Doyle seems to be aiming for, and what Watson attempts (unsuccessfully) as the reader surrogate, is slightly different: once Holmes makes one of his grand assertions, the reader is encouraged to guess what Holmes might have done to arrive at that conclusion. Doyle seems to want the reader to guess technique rather than outcome, while providing only vague clues in general descriptions of Holmes's behavior at a crime scene. The structure of this story is quite odd. The first part is roughly what you would expect: first-person narration from Watson, supposedly taken from his journals but not at all in the style of a journal and explicitly written for an audience. Part one concludes with Holmes capturing and dramatically announcing the name of the killer, who the reader has never heard of before. Part two then opens with... a western?
In the central portion of the great North American Continent there lies an arid and repulsive desert, which for many a long year served as a barrier against the advance of civilization. From the Sierra Nevada to Nebraska, and from the Yellowstone River in the north to the Colorado upon the south, is a region of desolation and silence. Nor is Nature always in one mood throughout the grim district. It comprises snow-capped and lofty mountains, and dark and gloomy valleys. There are swift-flowing rivers which dash through jagged cañons; and there are enormous plains, which in winter are white with snow, and in summer are grey with the saline alkali dust. They all preserve, however, the common characteristics of barrenness, inhospitality, and misery.
First, I have issues with the geography. That region contains some of the most beautiful areas on earth, and while a lot of that region is arid, describing it primarily as a repulsive desert is a bit much. Doyle's boundaries and distances are also confusing: the Yellowstone is a northeast-flowing river with its source in Wyoming, so the area between it and the Colorado does not extend to the Sierra Nevadas (or even to Utah), and it's not entirely clear to me that he realizes Nevada exists. This is probably what it's like for people who live anywhere else in the world when US authors write about their country. But second, there's no Holmes, no Watson, and not even the pretense of a transition from the detective novel that we were just reading. Doyle just launches into a random western with an omniscient narrator. It features a lean, grizzled man and an adorable child that he adopts and raises into a beautiful free spirit, who then falls in love with a wild gold-rush adventurer. This was written about 15 years before the first critically recognized western novel, so I can't blame Doyle for all the cliches here, but to a modern reader all of these characters are straight from central casting. Well, except for the villains, who are the Mormons. By that, I don't mean that the villains are Mormon. I mean Brigham Young is the on-page villain, plotting against the hero to force his adopted daughter into a Mormon harem (to use the word that Doyle uses repeatedly) and ruling Salt Lake City with an iron hand, border guards with passwords (?!), and secret police. This part of the book was wild. I was laughing out-loud at the sheer malevolent absurdity of the thirty-day countdown to marriage, which I doubt was the intended effect. We do eventually learn that this is the backstory of the murder, but we don't return to Watson and Holmes for multiple chapters. Which leads me to the other thing that surprised me: Doyle lays out this backstory, but then never has his characters comment directly on the morality of it, only the spectacle. Holmes cares only for the intellectual challenge (and for who gets credit), and Doyle sets things up so that the reader need not concern themselves with aftermath, punishment, or anything of that sort. I probably shouldn't have been surprised this does fit with the Holmes stereotype but I'm used to modern fiction where there is usually at least some effort to pass judgment on the events of the story. Doyle draws very clear villains, but is utterly silent on whether the murder is justified. Given its status in the history of literature, I'm not sorry to have read this book, but I didn't particularly enjoy it. It is very much of its time: everyone's moral character is linked directly to their physical appearance, and Doyle uses the occasional racial stereotype without a second thought. Prevailing writing styles have changed, so the prose feels long-winded and breathless. The rivalry between Holmes and the police detectives is tedious and annoying. I also find it hard to read novels from before the general absorption of techniques of emotional realism and interiority into all genres. The characters in A Study in Scarlet felt more like cartoon characters than fully-realized human beings. I have no strong opinion about the objective merits of this book in the context of its time other than to note that the sudden inserted western felt very weird. 
My understanding is that this is not considered one of the better Holmes stories, and Holmes gets some deeper characterization later on. Maybe I'll try another of Doyle's works someday, but for now my curiosity has been sated. Followed by The Sign of the Four. Rating: 4 out of 10

25 December 2023

Sergio Talens-Oliag: GitLab CI/CD Tips: Automatic Versioning Using semantic-release

This post describes how I'm using semantic-release on gitlab-ci to manage versioning automatically for different kinds of projects following a simple workflow (a develop branch where changes are added or merged to test new versions, a temporary release/#.#.# to generate the release candidate versions and a main branch where the final versions are published).

What is semantic-release It is a Node.js application designed to manage project versioning information on Git repositories using a continuous integration system (in this post we will use gitlab-ci).

How does it work By default semantic-release uses semver for versioning (release versions use the format MAJOR.MINOR.PATCH) and commit messages are parsed to determine the next version number to publish. If after analyzing the commits the version number has to be changed, the command updates the files we tell it to (i.e. the package.json file for nodejs projects and possibly a CHANGELOG.md file), creates a new commit with the changed files, creates a tag with the new version and pushes the changes to the repository. When running on a CI/CD system we usually generate the artifacts related to a release (a package, a container image, etc.) from the tag, as it includes the right version number and usually has passed all the required tests (it is a good idea to run the tests again in any case, as someone could create a tag manually, or we could run extra jobs when building the final assets; if they fail it is not a big issue anyway, version numbers are cheap and infinite, so we can skip releases if needed).

Commit messages and versioning The commit messages must follow a known format; the default module used to analyze them uses the angular git commit guidelines, but I prefer the conventional commits one, mainly because it's a lot easier to use when you want to update the MAJOR version. The commit message format used must be (a couple of example messages follow the format below):
<type>(optional scope): <description>
[optional body]
[optional footer(s)]
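For illustration, a couple of hypothetical commit subjects following this format (the scopes and descriptions are made up) would be:
fix(parser): handle empty configuration files
feat(api)!: drop the XML export endpoint
The first one would trigger a PATCH change and the second one a MAJOR change (the exclamation mark marks it as a breaking change; a BREAKING CHANGE: footer in the body would have the same effect), as described in the following sections.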
The system supports three types of branches: release, maintenance and pre-release, but for now I'm not using maintenance ones. The branches I use and their types are:
  • main as release branch (final versions are published from there)
  • develop as pre release branch (used to publish development and testing versions with the format #.#.#-SNAPSHOT.#)
  • release/#.#.# as pre release branches (they are created from develop to publish release candidate versions with the format #.#.#-rc.# and once they are merged with main they are deleted)
On the release branch (main) the version number is updated as follows (a short worked example follows this list):
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last version change (it looks for tags on the same branch).
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last version change.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the commits found since the last version change.
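As a quick worked example (the commit subjects are hypothetical), assume the last tag on main is v1.3.2:
  • a commit fix(parser): handle empty input alone bumps the version to 1.3.3;
  • a commit feat(cli): add a --json flag bumps it to 1.4.0, even if fix commits are also present;
  • a commit feat(cli)!: drop the legacy flags (or any commit with a BREAKING CHANGE: footer) bumps it to 2.0.0.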
On the pre release branches (develop and release/#.#.#) the version and pre release numbers are always calculated from the last published version available on the branch (i.e. if we published version 1.3.2 on main we need to have the commit with that tag on the develop or release/#.#.# branch to correctly compute what the next version will be). The version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last released version. In our example it was 1.3.2 and the version is updated to 2.0.0-SNAPSHOT.1 or 2.0.0-rc.1 depending on the branch.
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last released version. In our example the release was 1.3.2 and the version is updated to 1.4.0-SNAPSHOT.1 or 1.4.0-rc.1 depending on the branch.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the commits found since the last version change. In our example the release was 1.3.2 and the version is updated to 1.3.3-SNAPSHOT.1 or 1.3.3-rc.1 depending on the branch.
  4. The pre release number is incremented if the MAJOR, MINOR and PATCH numbers are not going to be changed but there is a commit that would otherwise update the version (i.e. a fix on 1.3.3-SNAPSHOT.1 will set the version to 1.3.3-SNAPSHOT.2, a fix or feat on 1.4.0-rc.1 will set the version to 1.4.0-rc.2 and so on).

How do we manage its configuration Although the system is designed to work with nodejs projects, it can be used with multiple programming languages and project types. For nodejs projects the usual place to put the configuration is the project's package.json, but I prefer to use the .releaserc file instead. As I use a common set of CI templates, instead of using a .releaserc on each project I generate it on the fly on the jobs that need it, replacing values related to the project type and the current branch on a template using the tmpl command (lately I use a branch of my own fork while I wait for some feedback from upstream, as you will see on the Dockerfile).

Container used to run it As we run the command on a gitlab-ci job we use the image built from the following Dockerfile (an example of how the image can be built and published follows it):
Dockerfile
# Semantic release image
FROM golang:alpine AS tmpl-builder
#RUN go install github.com/krakozaure/tmpl@v0.4.0
RUN go install github.com/sto/tmpl@v0.4.0-sto.2
FROM node:lts-alpine
COPY --from=tmpl-builder /go/bin/tmpl /usr/local/bin/tmpl
RUN apk update &&\
  apk upgrade &&\
  apk add curl git jq openssh-keygen yq zip &&\
  npm install --location=global\
    conventional-changelog-conventionalcommits@6.1.0\
    @qiwi/multi-semantic-release@7.0.0\
    semantic-release@21.0.7\
    @semantic-release/changelog@6.0.3\
    semantic-release-export-data@1.0.1\
    @semantic-release/git@10.0.1\
    @semantic-release/gitlab@9.5.1\
    @semantic-release/release-notes-generator@11.0.4\
    semantic-release-replace-plugin@1.2.7\
    semver@7.5.4\
  &&\
  rm -rf /var/cache/apk/*
CMD ["/bin/sh"]
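As a minimal sketch of how the image is published (the registry path and tag are made up, adjust them to your setup), it can be built and pushed so that the SEMANTIC_RELEASE_IMAGE variable used by the job shown below can point to it:
docker build -t registry.gitlab.com/my-group/ci-images/semantic-release:latest .
docker push registry.gitlab.com/my-group/ci-images/semantic-release:latest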

How and when is it executed The job that runs semantic-release is executed when new commits are added to the develop, release/#.#.# or main branches (basically when something is merged or pushed) and after all tests have passed (we don't want to create a new version that does not compile or does not pass at least the unit tests). The job is something like the following:
semantic_release:
  image: $SEMANTIC_RELEASE_IMAGE
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^(develop|main|release\/\d+.\d+.\d+)$/'
      when: always
  stage: release
  before_script:
    - echo "Loading scripts.sh"
    - . $ASSETS_DIR/scripts.sh
  script:
    - sr_gen_releaserc_json
    - git_push_setup
    - semantic-release
Where the SEMANTIC_RELEASE_IMAGE variable contains the URI of the image built using the Dockerfile above and the sr_gen_releaserc_json and git_push_setup are functions defined on the $ASSETS_DIR/scripts.sh file:
  • The sr_gen_releaserc_json function generates the .releaserc.json file using the tmpl command.
  • The git_push_setup function configures git to allow pushing changes to the repository with the semantic-release command, optionally signing them with a SSH key.

The sr_gen_releaserc_json function The code for the sr_gen_releaserc_json function is the following:
sr_gen_releaserc_json()
{
  # Use nodejs as default project_type
  project_type="${PROJECT_TYPE:-nodejs}"
  # REGEX to match the rc_branch name
  rc_branch_regex='^release\/[0-9]\+\.[0-9]\+\.[0-9]\+$'
  # PATHS on the local ASSETS_DIR
  assets_dir="${CI_PROJECT_DIR}/${ASSETS_DIR}"
  sr_local_plugin="${assets_dir}/local-plugin.cjs"
  releaserc_tmpl="${assets_dir}/releaserc.json.tmpl"
  pipeline_runtime_values_yaml="/tmp/releaserc_values.yaml"
  pipeline_values_yaml="${assets_dir}/values_${project_type}_project.yaml"
  # Destination PATH
  releaserc_json=".releaserc.json"
  # Create an empty pipeline_values_yaml if missing
  test -f "$pipeline_values_yaml" || : >"$pipeline_values_yaml"
  # Create the pipeline_runtime_values_yaml file
  echo "branch: ${CI_COMMIT_BRANCH}" >"$pipeline_runtime_values_yaml"
  echo "gitlab_url: ${CI_SERVER_URL}" >>"$pipeline_runtime_values_yaml"
  # Add the rc_branch name if we are on an rc_branch
  if [ "$(echo "$CI_COMMIT_BRANCH" | sed -ne "/$rc_branch_regex/{p}")" ]; then
    echo "rc_branch: ${CI_COMMIT_BRANCH}" >>"$pipeline_runtime_values_yaml"
  elif [ "$(echo "$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME" |
      sed -ne "/$rc_branch_regex/{p}")" ]; then
    echo "rc_branch: ${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME}" \
      >>"$pipeline_runtime_values_yaml"
  fi
  echo "sr_local_plugin: ${sr_local_plugin}" >>"$pipeline_runtime_values_yaml"
  # Create the releaserc_json file
  tmpl -f "$pipeline_runtime_values_yaml" -f "$pipeline_values_yaml" \
    "$releaserc_tmpl" | jq . >"$releaserc_json"
  # Remove the pipeline_runtime_values_yaml file
  rm -f "$pipeline_runtime_values_yaml"
  # Print the releaserc_json file
  print_file_collapsed "$releaserc_json"
  # --*-- BEG: NOTE --*--
  # Rename the package.json to ignore it when calling semantic release.
  # The idea is that the local-plugin renames it back on the first step of the
  # semantic-release process.
  # --*-- END: NOTE --*--
  if [ -f "package.json" ]; then
    echo "Renaming 'package.json' to 'package.json_disabled'"
    mv "package.json" "package.json_disabled"
  fi
}
Almost all the variables used on the function are defined by gitlab except the ASSETS_DIR and PROJECT_TYPE; in the complete pipelines the ASSETS_DIR is defined on a common file included by all the pipelines and the project type is defined on the .gitlab-ci.yml file of each project. If you review the code you will see that the file processed by the tmpl command is named releaserc.json.tmpl, its contents are shown here:
{
  "plugins": [
    {{- if .sr_local_plugin }}
    "{{ .sr_local_plugin }}",
    {{- end }}
    [
      "@semantic-release/commit-analyzer",
      {
        "preset": "conventionalcommits",
        "releaseRules": [
          { "breaking": true, "release": "major" },
          { "revert": true, "release": "patch" },
          { "type": "feat", "release": "minor" },
          { "type": "fix", "release": "patch" },
          { "type": "perf", "release": "patch" }
        ]
      }
    ],
    {{- if .replacements }}
    [
      "semantic-release-replace-plugin",
      { "replacements": {{ .replacements | toJson }} }
    ],
    {{- end }}
    "@semantic-release/release-notes-generator",
    {{- if eq .branch "main" }}
    [
      "@semantic-release/changelog",
      { "changelogFile": "CHANGELOG.md", "changelogTitle": "# Changelog" }
    ],
    {{- end }}
    [
      "@semantic-release/git",
      {
        "assets": {{ if .assets }}{{ .assets | toJson }}{{ else }}[]{{ end }},
        "message": "ci(release): v${nextRelease.version}\n\n${nextRelease.notes}"
      }
    ],
    [
      "@semantic-release/gitlab",
      { "gitlabUrl": "{{ .gitlab_url }}", "successComment": false }
    ]
  ],
  "branches": [
    { "name": "develop", "prerelease": "SNAPSHOT" },
    {{- if .rc_branch }}
    { "name": "{{ .rc_branch }}", "prerelease": "rc" },
    {{- end }}
    "main"
  ]
}
The values used to process the template are defined on a file built on the fly (releaserc_values.yaml) that includes the following keys and values (an example of the generated file follows this list):
  • branch: the name of the current branch
  • gitlab_url: the URL of the gitlab server (the value is taken from the CI_SERVER_URL variable)
  • rc_branch: the name of the current rc branch; we only set the value if we are processing one because semantic-release only allows one branch to match the rc prefix and if we use a wildcard (i.e. release/*) but the users keep more than one release/#.#.# branch open at the same time the calls to semantic-release will fail for sure.
  • sr_local_plugin: the path to the local plugin we use (shown later)
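As an example, when the pipeline runs on a hypothetical release/1.4.0 branch of a nodejs project the generated releaserc_values.yaml could look like this (the URL and paths are made-up values):
branch: release/1.4.0
gitlab_url: https://gitlab.example.com
rc_branch: release/1.4.0
sr_local_plugin: /builds/my-group/my-project/assets/local-plugin.cjs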
The template also uses a values_${project_type}_project.yaml file that includes settings specific to the project type; the one for nodejs is as follows:
replacements:
  - files:
      - "package.json"
    from: "\"version\": \".*\""
    to: "\"version\": \"${nextRelease.version}\""
assets:
  - "CHANGELOG.md"
  - "package.json"
The replacements section is used to update the version field on the relevant files of the project (in our case the package.json file) and the assets section includes the files that will be committed to the repository when the release is published (looking at the template you can see that the CHANGELOG.md is only updated for the main branch; we do it this way because if we update the file on other branches it creates a merge nightmare and we are only interested in it for released versions anyway). The local plugin adds code to rename the package.json_disabled file to package.json if present and prints the last and next versions on the logs with a format that can be easily parsed using sed (a parsing example follows the plugin code):
local-plugin.cjs
// Minimal plugin to:
// - rename the package.json_disabled file to package.json if present
// - log the semantic-release last & next versions
function verifyConditions(pluginConfig, context) {
  var fs = require('fs');
  if (fs.existsSync('package.json_disabled')) {
    fs.renameSync('package.json_disabled', 'package.json');
    context.logger.log(`verifyConditions: renamed 'package.json_disabled' to 'package.json'`);
  }
}
function analyzeCommits(pluginConfig, context) {
  if (context.lastRelease && context.lastRelease.version) {
    context.logger.log(`analyzeCommits: LAST_VERSION=${context.lastRelease.version}`);
  }
}
function verifyRelease(pluginConfig, context) {
  if (context.nextRelease && context.nextRelease.version) {
    context.logger.log(`verifyRelease: NEXT_VERSION=${context.nextRelease.version}`);
  }
}
module.exports = {
  verifyConditions,
  analyzeCommits,
  verifyRelease
};
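As a sketch of how those log lines can be parsed (the log file name is hypothetical and assumes the semantic-release output was captured, for example with semantic-release | tee semantic-release.log), the versions can be extracted with sed:
LAST_VERSION="$(sed -ne 's/.*analyzeCommits: LAST_VERSION=\(.*\)$/\1/p' semantic-release.log)"
NEXT_VERSION="$(sed -ne 's/.*verifyRelease: NEXT_VERSION=\(.*\)$/\1/p' semantic-release.log)"
echo "Last version: ${LAST_VERSION:-none}; next version: ${NEXT_VERSION:-no new release}"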

The git_push_setup function The code for the git_push_setup function is the following:
git_push_setup()
{
  # Update global credentials to allow git clone & push for all the group repos
  git config --global credential.helper store
  cat >"$HOME/.git-credentials" <<EOF
https://fake-user:${GITLAB_REPOSITORY_TOKEN}@gitlab.com
EOF
  # Define user name, mail and signing key for semantic-release
  user_name="$SR_USER_NAME"
  user_email="$SR_USER_EMAIL"
  ssh_signing_key="$SSH_SIGNING_KEY"
  # Export git user variables
  export GIT_AUTHOR_NAME="$user_name"
  export GIT_AUTHOR_EMAIL="$user_email"
  export GIT_COMMITTER_NAME="$user_name"
  export GIT_COMMITTER_EMAIL="$user_email"
  # Sign commits with ssh if there is a SSH_SIGNING_KEY variable
  if [ "$ssh_signing_key" ]; then
    echo "Configuring GIT to sign commits with SSH"
    ssh_keyfile="/tmp/.ssh-id"
    : >"$ssh_keyfile"
    chmod 0400 "$ssh_keyfile"
    echo "$ssh_signing_key" | tr -d '\r' >"$ssh_keyfile"
    git config gpg.format ssh
    git config user.signingkey "$ssh_keyfile"
    git config commit.gpgsign true
  fi
}
The function assumes that the GITLAB_REPOSITORY_TOKEN variable (set on the CI/CD variables section of the project or group we want) contains a token with read_repository and write_repository permissions on all the projects where we are going to use this function. The SR_USER_NAME and SR_USER_EMAIL variables can be defined on a common file or the CI/CD variables section of the project or group we want to work with, and the script assumes that the optional SSH_SIGNING_KEY is exported as a CI/CD default value of type variable (that is why the keyfile is created on the fly) and git is configured to use it if the variable is not empty.
Warning: Keep in mind that the variables GITLAB_REPOSITORY_TOKEN and SSH_SIGNING_KEY contain secrets, so it is probably a good idea to make them protected (if you do that you have to make the develop, main and release/* branches protected too).
Warning: The semantic-release user has to be able to push to all the projects on those protected branches; it is a good idea to create a dedicated user and add it as a MAINTAINER for the projects we want (the MAINTAINERS need to be able to push to the branches), or, if you are using a GitLab instance with a Premium license, you can use the API to allow the semantic-release user to push to the protected branches without allowing it for any other user.

The semantic-release command Once we have the .releaserc file and the git configuration ready we run the semantic-release command. If the branch we are working with has one or more commits that will increment the version, the tool does the following (note that the steps described are the ones executed if we use the configuration we have generated; a dry-run example follows the list):
  1. It detects the commits that will increment the version and calculates the next version number.
  2. Generates the release notes for the version.
  3. Applies the replacements defined on the configuration (in our example updates the version field on the package.json file).
  4. Updates the CHANGELOG.md file adding the release notes if we are going to publish the file (when we are on the main branch).
  5. Creates a commit if all or some of the files listed on the assets key have changed and uses the commit message we have defined, replacing the variables for their current values.
  6. Creates a tag with the new version number and the release notes.
  7. As we are using the gitlab plugin after tagging it also creates a release on the project with the tag name and the release notes.
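As a side note, if we want to preview what the command would do for a given branch without creating commits, tags or releases, semantic-release can be invoked with its --dry-run flag (assuming the required tokens are available in the environment, as they are in the CI job shown earlier):
semantic-release --dry-run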

Notes about the git workflows and merges between branches It is very important to remember that semantic-release looks at the commits of a given branch when calculating the next version to publish; that has two important implications:
  1. On pre release branches we need to have the commit that includes the tag with the released version; if we don't have it the next version is not calculated correctly.
  2. It is a bad idea to squash commits when merging a branch to another one, if we do that we will lose the information semantic-release needs to calculate the next version and even if we use the right prefix for the squashed commit (fix, feat, ) we miss all the messages that would otherwise go to the CHANGELOG.md file.
To make sure that we have the right commits on the pre-release branches we should merge the main branch changes into the develop one after each release tag is created; in my pipelines the first job that processes a release tag creates a branch from the tag and an MR to merge it into develop. The important thing about that MR is that it must not be squashed; if we do that the tag commit will probably be lost, so we need to be careful. To merge the changes directly we can run the following code:
# Set the SR_TAG variable to the tag you want to process
SR_TAG="v1.3.2"
# Fetch all the changes
git fetch --all --prune
# Switch to the main branch
git switch main
# Pull all the changes
git pull
# Switch to the development branch
git switch develop
# Pull all the changes
git pull
# Create followup branch from tag
git switch -c "followup/$SR_TAG" "$SR_TAG"
# Change files manually & commit the changed files
git commit -a --untracked-files=no -m "ci(followup): $SR_TAG to develop"
# Switch to the development branch
git switch develop
# Merge the followup branch into the development one using the --no-ff option
git merge --no-ff "followup/$SR_TAG"
# Remove the followup branch
git branch -d "followup/$SR_TAG"
# Push the changes
git push
If we can't push directly to develop we can create an MR by pushing the followup branch after committing the changes, but we have to make sure that we don't squash the commits when merging or it will not work as we want.
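With GitLab push options we can even create that MR directly when pushing the followup branch; just remember to use a regular merge, not a squash, when accepting it. A short sketch:
# Push the followup branch and open an MR against develop in one step
git push -o merge_request.create \
         -o merge_request.target=develop \
         -o merge_request.title="ci(followup): $SR_TAG to develop" \
         origin "followup/$SR_TAG"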

21 December 2023

Russ Allbery: Review: The Box

Review: The Box, by Marc Levinson
Publisher: Princeton University Press
Copyright: 2006, 2008
Printing: 2008
ISBN: 0-691-13640-8
Format: Trade paperback
Pages: 278
The shipping container as we know it is only about 65 years old. Shipping things in containers is obviously much older; we've been doing that for longer than we've had ships. But the standardized metal box, set on a rail car or loaded with hundreds of its indistinguishable siblings into an enormous, specially-designed cargo ship, became economically significant only recently. Today it is one of the oft-overlooked foundations of global supply chains. The startlingly low cost of container shipping is part of why so much of what US consumers buy comes from Asia, and why most complex machinery is assembled in multiple countries from parts gathered from a dizzying variety of sources. Marc Levinson's The Box is a history of container shipping, from its (arguable) beginnings in the trailer bodies loaded on Pan-Atlantic Steamship Corporation's Ideal-X in 1956 to just-in-time international supply chains in the 2000s. It's a popular history that falls on the academic side, with a full index and 60 pages of citations and other notes. (Per my normal convention, those pages aren't included in the sidebar page count.) The Box is organized mostly chronologically, but Levinson takes extended detours into labor relations and container standardization at the appropriate points in the timeline. The book is very US-centric. Asian, European, and Australian shipping is discussed mostly in relation to trade with the US, and Africa is barely mentioned. I don't have the background to know whether this is historically correct for container shipping or is an artifact of Levinson's focus. Many single-item popular histories focus on something that involves obvious technological innovation (paint pigments) or deep cultural resonance (salt) or at least entertaining quirkiness (punctuation marks, resignation letters). Shipping containers are important but simple and boring. The least interesting chapter in The Box covers container standardization, in which a whole bunch of people had boring meetings, wrote some things down, discovered many of the things they wrote down were dumb, wrote more things down, met with different people to have more meetings, published a standard that partly reflected the fixations of that one guy who is always involved in standards discussions, and then saw that standard be promptly ignored by the major market players. You may be wondering if that describes the whole book. It doesn't, but not because of the shipping containers. The Box is interesting because the process of economic change is interesting, and container shipping is almost entirely about business processes rather than technology. Levinson starts the substance of the book with a description of shipping before standardized containers. This was the most effective, and probably the most informative, chapter. Beyond some vague ideas picked up via cultural osmosis, I had no idea how cargo shipping worked. Levinson gives the reader a memorable feel for the sheer amount of physical labor involved in loading and unloading a ship with mixed cargo (what's called "breakbulk" cargo to distinguish it from bulk cargo like coal or wheat that fills an entire hold). It's not just the effort of hauling barrels, bales, or boxes with cranes or raw muscle power, although that is significant. It's also the need to touch every piece of cargo to move it, inventory it, warehouse it, and then load it on a truck or train.
The idea of container shipping is widely attributed, including by Levinson, to Malcom McLean, a trucking magnate who became obsessed with the idea of what we now call intermodal transport: using the same container for goods on ships, railroads, and trucks so that the contents don't have to be unpacked and repacked at each transfer point. Levinson uses his career as an anchor for the story, from his acquisition of Pan-Atlantic Steamship Corporation to pursue his original idea (backed by private equity and debt, in a very modern twist), through his years running Sea-Land as the first successful major container shipper, and culminating in his disastrous attempted return to shipping by acquiring United States Lines. I am dubious of Great Man narratives in history books, and I think Levinson may be overselling McLean's role. Container shipping was an obvious idea that the industry had been talking about for decades. Even Levinson admits that, despite a few gestures at giving McLean central credit. Everyone involved in shipping understood that cargo handling was the most expensive and time-consuming part, and that if one could minimize cargo handling at the docks by loading and unloading full containers that didn't have to be opened, shipping costs would be much lower (and profits higher). The idea wasn't the hard part. McLean was the first person to pull it off at scale, thanks to some audacious economic risks and a willingness to throw sharp elbows and play politics, but it seems likely that someone else would have played that role if McLean hadn't existed. Container shipping didn't happen earlier because achieving that cost savings required a huge expenditure of capital and a major disruption of a transportation industry that wasn't interested in being disrupted. The ships had to be remodeled and eventually replaced; manufacturing had to change; railroad and trucking in theory had to change (in practice, intermodal transport, McLean's obsession, didn't happen at scale until much later); pricing had to be entirely reworked; logistical tracking of goods had to be done much differently; and significant amounts of extremely expensive equipment to load and unload heavy containers had to be designed, built, and installed. McLean's efforts proved the cost savings was real and compelling, but it still took two decades before the shipping industry reconstructed itself around containers. That interim period is where this history becomes a labor story, and that's where Levinson's biases become somewhat distracting. In the United States, loading and unloading of cargo ships was done by unionized longshoremen through a bizarre but complex and long-standing system of contract hiring. The cost savings of container shipping comes almost completely from the loss of work for longshoremen. It's a classic replacement of labor with capital; the work done by gangs of twenty or more longshoremen is instead done by a single crane operator at much higher speed and efficiency. The longshoreman unions therefore opposed containerization and launched numerous strikes and other labor actions to delay use of containers, force continued hiring that containers made unnecessary, or win buyouts and payoffs for current longshoremen. Levinson is trying to write a neutral history and occasionally shows some sympathy for longshoremen, but they still get the Luddite treatment in this book: the doomed reactionaries holding back progress.
Longshoremen had a vigorous and powerful union that won better working conditions structured in ways that look absurd to outsiders, such as requiring that ships hire twice as many men as necessary so that half of them could get paid while not working. The unions also had a reputation for corruption that Levinson stresses constantly, and theft of breakbulk cargo during loading and warehousing was common. One of the interesting selling points for containers was that lossage from theft during shipping apparently decreased dramatically. It's obvious that the surface demand of the longshoremen unions, that either containers not be used or that just as many manual laborers be hired for container shipping as for earlier breakbulk shipping, was impossible, and that the profession as it existed in the 1950s was doomed. But beneath those facts, and the smoke screen of Levinson's obvious distaste for their unions, is a real question about what society owes workers whose jobs are eliminated by major shifts in business practices. That question of fairness becomes more pointed when one realizes that this shift was massively subsidized by US federal and local governments. McLean's Sea-Land benefited from direct government funding and subsidized navy surplus ships, massive port construction in New Jersey with public funds, and a sweetheart logistics contract from the US military to supply troops fighting the Vietnam War that was so generous that the return voyage was free and every container Sea-Land picked up from Japanese ports was pure profit. The US shipping industry was heavily government-supported, particularly in the early days when the labor conflicts were starting. Levinson notes all of this, but never draws the contrast between the massive support for shipping corporations and the complete lack of formal support for longshoremen. There are hard ethical questions about what society owes displaced workers even in a pure capitalist industry transformation, and this was very far from pure capitalism. The US government bankrolled large parts of the growth of container shipping, but the only way that longshoremen could get part of that money was through strikes to force payouts from private shipping companies. There are interesting questions of social and ethical history here that would require careful disentangling of the tendency of any group to oppose disruptive change and fairness questions of who gets government support and who doesn't. They will have to wait for another book; Levinson never mentions them. There were some things about this book that annoyed me, but overall it's a solid work of popular history and deserves its fame. Levinson's account is easy to follow, specific without being tedious, and backed by voluminous notes. It's not the most compelling story on its own merits; you have to have some interest in logistics and economics to justify reading the entire saga. But it's the sort of history that gives one a sense of the fractal complexity of any area of human endeavor, and I usually find those worth reading. Recommended if you like this sort of thing. Rating: 7 out of 10

12 December 2023

Freexian Collaborators: Monthly report about Debian Long Term Support, November 2023 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering. Some notable fixes which were made in LTS during the month of November include the gnutls28 cryptographic library and the freerdp2 Remote Desktop Protocol client/server implementation. The gnutls28 update was prepared by LTS contributor Markus Koschany and dealt with a timing attack which could be used to compromise a cryptographic system, while the freerdp2 update was prepared by LTS contributor Tobias Frost and is the result of work spanning 3 months to deal with dozens of vulnerabilities. In addition to the many ordinary LTS tasks which were completed (CVE triage, patch backports, package updates, etc), there were several contributions by LTS contributors for the benefit of Debian stable and old-stable releases, as well as for the benefit of upstream projects. LTS contributor Abhijith PA uploaded an update of the puma package to unstable in order to fix a vulnerability in that package, while LTS contributor Thorsten Alteholz sponsored an upload to unstable of libde265 and himself made corresponding uploads of libde265 to Debian stable and old-stable. LTS contributor Bastien Roucariès developed patches for vulnerabilities in zbar and audiofile which were then provided to the respective upstream projects. Updates to packages in Debian stable were made by Markus Koschany to deal with security vulnerabilities and by Chris Lamb to deal with some non-security bugs. As always, the LTS team strives to provide high quality updates to packages under the direct purview of the LTS team while also rendering assistance to maintainers, the stable security team, and upstream developers whenever practical.

Debian LTS contributors In November, 18 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 7.0h (out of 0h assigned and 14.0h from previous period), thus carrying over 7.0h to the next month.
  • Adrian Bunk did 15.0h (out of 14.0h assigned and 9.75h from previous period), thus carrying over 8.75h to the next month.
  • Anton Gladky did 10.0h (out of 9.5h assigned and 5.5h from previous period), thus carrying over 5.0h to the next month.
  • Bastien Roucariès did 16.0h (out of 18.25h assigned and 1.75h from previous period), thus carrying over 4.0h to the next month.
  • Ben Hutchings did 12.0h (out of 16.5h assigned and 12.25h from previous period), thus carrying over 16.75h to the next month.
  • Chris Lamb did 18.0h (out of 17.25h assigned and 0.75h from previous period).
  • Emilio Pozuelo Monfort did 15.5h (out of 23.5h assigned and 0.25h from previous period), thus carrying over 8.25h to the next month.
  • Guilhem Moulin did 13.0h (out of 12.0h assigned and 8.0h from previous period), thus carrying over 7.0h to the next month.
  • Lee Garrett did 14.5h (out of 16.75h assigned and 7.0h from previous period), thus carrying over 9.25h to the next month.
  • Markus Koschany did 30.0h (out of 30.0h assigned).
  • Ola Lundqvist did 6.5h (out of 8.25h assigned and 15.5h from previous period), thus carrying over 17.25h to the next month.
  • Roberto C. Sánchez did 5.5h (out of 12.0h assigned), thus carrying over 6.5h to the next month.
  • Santiago Ruano Rincón did 3.25h (out of 13.62h assigned and 2.375h from previous period), thus carrying over 12.745h to the next month.
  • Sean Whitton did 3.25h (out of 10.0h assigned), thus carrying over 6.75h to the next month.
  • Sylvain Beucler did 10.0h (out of 13.5h assigned and 10.25h from previous period), thus carrying over 13.75h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 12.0h (out of 12.0h assigned).
  • Utkarsh Gupta did 0.0h (out of 6.0h assigned and 17.75h from previous period), thus carrying over 23.75h to the next month.

Evolution of the situation In November, we have released 35 DLAs.

Thanks to our sponsors Sponsors that joined recently are in bold.

6 December 2023

Reproducible Builds: Reproducible Builds in November 2023

Welcome to the November 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a rather rapid recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries (more).

Reproducible Builds Summit 2023 Between October 31st and November 2nd, we held our seventh Reproducible Builds Summit in Hamburg, Germany! Amazingly, the agenda and all notes from all sessions are all online; many thanks to everyone who wrote notes from the sessions. As a followup on one idea started at the summit, Alexander Couzens and Holger Levsen started work on a cache (or tailored front-end) for the snapshot.debian.org service. The general idea is that, when rebuilding Debian, you do not actually need the whole ~140TB of data from snapshot.debian.org; rather, only a very small subset of the packages are ever used for building. It turns out, for amd64, arm64, armhf, i386, ppc64el, riscv64 and s390 for Debian trixie, unstable and experimental, this is only around 500GB, i.e. less than 1%. Although the new service is not yet ready for use, it has already provided a promising outlook in this regard. More information is available on https://rebuilder-snapshot.debian.net and we hope that this service becomes usable in the coming weeks. The adjacent picture shows a sticky note authored by Jan-Benedict Glaw at the summit in Hamburg, confirming Holger Levsen's theory that rebuilding all Debian packages needs a very small subset of packages: the text states that 69,200 packages (in Debian sid) list 24,850 packages in their .buildinfo files, in 8,0200 variations. This little piece of paper was the beginning of rebuilder-snapshot and is a direct outcome of the summit! The Reproducible Builds team would like to thank our event sponsors who include Mullvad VPN, openSUSE, Debian, Software Freedom Conservancy, Allotropia and Aspiration Tech.

Beyond Trusting FOSS presentation at SeaGL On November 4th, Vagrant Cascadian presented Beyond Trusting FOSS at SeaGL in Seattle, WA in the United States. Founded in 2013, SeaGL is a free, grassroots technical summit dedicated to spreading awareness and knowledge about free source software, hardware and culture. The summary of Vagrant's talk mentions that it will:
[ ] introduce the concepts of Reproducible Builds, including best practices for developing and releasing software, the tools available to help diagnose issues, and touch on progress towards solving decades-old deeply pervasive fundamental security issues Learn how to verify and demonstrate trust, rather than simply hoping everything is OK!
Germane to the contents of the talk, the slides for Vagrant's talk can be built reproducibly, resulting in a PDF with a SHA1 of cfde2f8a0b7e6ec9b85377eeac0661d728b70f34 when built on Debian bookworm and c21fab273232c550ce822c4b0d9988e6c49aa2c3 on Debian sid at the time of writing.

Human Factors in Software Supply Chain Security Marcel Fourné, Dominik Wermke, Sascha Fahl and Yasemin Acar have published an article in a Special Issue of the IEEE's Security & Privacy magazine. Entitled A Viewpoint on Human Factors in Software Supply Chain Security: A Research Agenda, the paper justifies the need for reproducible builds to reach developers and end-users specifically, and furthermore points out some under-researched topics that we have seen mentioned in interviews. An author pre-print of the article is available in PDF form.

Community updates On our mailing list this month:

openSUSE updates Bernhard M. Wiedemann has created a wiki page outlining a proposal to create a general-purpose Linux distribution which consists of 100% bit-reproducible packages, albeit minus the embedded signature within RPM files. It would be based on openSUSE Tumbleweed or, if available, its Slowroll variant. In addition, Bernhard posted another monthly update for his work elsewhere in openSUSE.

Ubuntu Launchpad now supports .buildinfo files Back in 2017, Steve Langasek filed a bug against Ubuntu's Launchpad code hosting platform to report that .changes files (artifacts of building Ubuntu and Debian packages) reference .buildinfo files that aren't actually exposed by Launchpad itself. This was causing issues when attempting to process .changes files with tools such as Lintian. However, it was noticed last month that, in early August of this year, Simon Quigley had resolved this issue, and .buildinfo files are now available from the Launchpad system.

PHP reproducibility updates There have been two updates from the PHP programming language this month. Firstly, the widely-deployed PHPUnit framework has recently released version 10.5.0, which introduces the inclusion of a composer.lock file, ensuring total reproducibility of the shipped binary file. Further details and the discussion that went into their particular implementation can be found on the associated GitHub pull request. In addition, the presentation Leveraging Nix in the PHP ecosystem was given in late October at the PHP International Conference in Munich by Pol Dellaiera. While the video replay is not yet available, the (reproducible) presentation slides and speaker notes are available.

diffoscope changes diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, including:
  • Improving DOS/MBR extraction by adding support for 7z. [ ]
  • Adding a missing RequiredToolNotFound import. [ ]
  • As a UI/UX improvement, try and avoid printing an extended traceback if diffoscope runs out of memory. [ ]
  • Mark diffoscope as stable on PyPI.org. [ ]
  • Uploading version 252 to Debian unstable. [ ]

Website updates A huge number of notes were added to our website that were taken at our recent Reproducible Builds Summit held between October 31st and November 2nd in Hamburg, Germany. In particular, a big thanks to Arnout Engelen, Bernhard M. Wiedemann, Daan De Meyer, Evangelos Ribeiro Tzaras, Holger Levsen and Orhun Parmaksız. In addition to this, a number of other changes were made, including:

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In October, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Track packages marked as Priority: important in a new package set. [ ][ ]
    • Stop scheduling packages that fail to build from source in bookworm [ ] and bullseye. [ ].
    • Add old releases dashboard link in web navigation. [ ]
    • Permit the pool_buildinfos script to be re-run for a specific year. [ ]
    • Grant jbglaw access to the osuosl4 node [ ][ ] along with lynxis [ ].
    • Increase RAM on the amd64 Ionos builders from 48 GiB to 64 GiB; thanks IONOS! [ ]
    • Move buster to archived suites. [ ][ ]
    • Reduce the number of arm64 architecture workers from 24 to 16 in order to improve stability [ ], reduce the workers for amd64 from 32 to 28 and, for i386, reduce from 12 down to 8 [ ].
    • Show the entire build history of each Debian package. [ ]
    • Stop scheduling already tested package/version combinations in Debian bookworm. [ ]
  • Snapshot service for rebuilders
    • Add an HTTP-based API endpoint. [ ][ ]
    • Add a Gunicorn instance to serve the HTTP API. [ ]
    • Add an NGINX config [ ][ ][ ][ ]
  • System-health:
    • Detect failures due to HTTP 503 Service Unavailable errors. [ ]
    • Detect failures to update package sets. [ ]
    • Detect unmet dependencies. (This usually occurs with builds of Debian live-build.) [ ]
  • Misc-related changes:
    • do install systemd-oomd on jenkins. [ ]
    • fix harmless typo in squid.conf for codethink04. [ ]
    • fixup: reproducible Debian: add gunicorn service to serve /api for rebuilder-snapshot.d.o. [ ]
    • Increase codethink04 s Squid cache_dir size setting to 16 GiB. [ ]
    • Don t install systemd-oomd as it unfortunately kills sshd [ ]
    • Use debootstrap from backports when commissioning nodes. [ ]
    • Add the live_build_debian_stretch_gnome, debsums-tests_buster and debsums-tests_buster jobs to the zombie list. [ ][ ]
    • Run jekyll build with the --watch argument when building the Reproducible Builds website. [ ]
    • Misc node maintenance. [ ][ ][ ]
Other changes were made as well, including Mattia Rizzolo fixing rc.local's Bash syntax so it can actually run [ ], commenting away some file cleanup code that is (potentially) deleting too much [ ] and fixing the html_breakages page for Debian package builds [ ]. Finally, an issue was diagnosed and a patch submitted to add an AddEncoding gzip .gz line to the tests.reproducible-builds.org Apache configuration so that Gzip files aren't re-compressed as Gzip, which some clients can't deal with (as well as being a waste of time). [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:

28 November 2023

Enrico Zini: Introducing Debusine

Abstract Debusine manages scheduling and distribution of Debian-related tasks (package build, lintian analysis, autopkgtest runs, etc.) to distributed worker machines. It is being developed by Freexian with the intention of giving people access to a range of pre-configured tools and workflows running on remote hardware. Freexian obtained STF funding for a substantial set of Debusine milestones, so development is happening on a clear schedule. We can present where we are, where we're going to be, and what we hope to bring to Debian with this work.

16 November 2023

Dimitri John Ledkov: Ubuntu 23.10 significantly reduces the installed kernel footprint


Photo by Pixabay
Ubuntu systems typically have up to 3 kernels installed, before they are auto-removed by apt on classic installs. Historically the installation was optimized for metered download size only. However, kernel size growth and usage no longer warrant such optimizations. During the 23.10 Mantic Minotaur cycle, I led a coordinated effort across multiple teams to implement lots of optimizations that together achieved unprecedented install footprint improvements.

Given a typical install of 3 generic kernel ABIs in the default configuration on a regular-sized VM (2 CPU cores 8GB of RAM) the following metrics are achieved in Ubuntu 23.10 versus Ubuntu 22.04 LTS:

  • 2x less disk space used (1,417MB vs 2,940MB, including initrd)

  • 3x less peak RAM usage for the initrd boot (68MB vs 204MB)

  • 0.5x increase in download size (949MB vs 600MB)

  • 2.5x faster initrd generation (4.5s vs 11.3s)

  • approximately the same total time (103s vs 98s, hardware dependent)


For minimal cloud images that do not install either linux-firmware or modules extra the numbers are:

  • 1.3x less disk space used (548MB vs 742MB)

  • 2.2x less peak RAM usage for initrd boot (27MB vs 62MB)

  • 0.4x increase in download size (207MB vs 146MB)


Hopefully, the compromise of download size, relative to the disk space & initrd savings is a win for the majority of platforms and use cases. For users on extremely expensive and metered connections, the likely best saving is to receive air-gapped updates or skip updates.

This was achieved by precompressing kernel modules & firmware files with the maximum level of Zstd compression at package build time; making actual .deb files uncompressed; assembling the initrd using split cpio archives - uncompressed for the pre-compressed files, whilst compressing only the userspace portions of the initrd; enabling in-kernel module decompression support with matching kmod; fixing bugs in all of the above, and landing all of these things in time for the feature freeze, whilst leveraging the experience and some of the design choices and implementations we have already been shipping on Ubuntu Core. Some of these changes are backported to Jammy, but only enough to support smooth upgrades to Mantic and later. Complete gains are only possible to experience on Mantic and later.
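For the curious, the effect of these changes can be inspected on a Mantic system with a few standard commands; the exact paths and module names below are illustrative defaults and may differ on your install:
# Kernel modules are shipped pre-compressed with zstd
ls /lib/modules/"$(uname -r)"/kernel/fs/btrfs/            # expect btrfs.ko.zst
# The kernel is configured to decompress such modules itself
grep -E 'CONFIG_MODULE_(COMPRESS_ZSTD|DECOMPRESS)' "/boot/config-$(uname -r)"
# The initrd is a concatenation of cpio archives; list or unpack it to see the
# uncompressed (pre-compressed firmware/modules) and compressed (userspace) parts
lsinitramfs "/boot/initrd.img-$(uname -r)" | head
unmkinitramfs "/boot/initrd.img-$(uname -r)" /tmp/initrd-unpacked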

The discovered bugs in kernel module loading code likely affect systems that use LoadPin LSM with kernel space module uncompression as used on ChromeOS systems. Hopefully, Kees Cook or other ChromeOS developers pick up the kernel fixes from the stable trees. Or you know, just use Ubuntu kernels as they do get fixes and features like these first.

The team that designed and delivered these changes is large: Benjamin Drung, Andrea Righi, Juerg Haefliger, Julian Andres Klode, Steve Langasek, Michael Hudson-Doyle, Robert Kratky, Adrien Nader, Tim Gardner, Roxana Nicolescu - and myself, Dimitri John Ledkov, ensuring the optimal solution is implemented, everything lands on time, and even implementing portions of the final solution.

Hi, It's me, I am a Staff Engineer at Canonical and we are hiring https://canonical.com/careers.

Lots of additional technical details and benchmarks on a huge range of diverse hardware and architectures, and bikeshedding all the things below:

For questions and comments please post to Kernel section on Ubuntu Discourse.



13 November 2023

Freexian Collaborators: Monthly report about Debian Long Term Support, October 2023 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In October, 18 contributors have been paid to work on Debian LTS, their reports are available:
  • Adrian Bunk did 8.0h (out of 7.75h assigned and 10.0h from previous period), thus carrying over 9.75h to the next month.
  • Anton Gladky did 9.5h (out of 9.5h assigned and 5.5h from previous period), thus carrying over 5.5h to the next month.
  • Bastien Roucariès did 16.0h (out of 16.75h assigned and 1.0h from previous period), thus carrying over 1.75h to the next month.
  • Ben Hutchings did 8.0h (out of 17.75h assigned), thus carrying over 9.75h to the next month.
  • Chris Lamb did 17.0h (out of 17.75h assigned), thus carrying over 0.75h to the next month.
  • Emilio Pozuelo Monfort did 17.5h (out of 17.75h assigned), thus carrying over 0.25h to the next month.
  • Guilhem Moulin did 9.75h (out of 17.75h assigned), thus carrying over 8.0h to the next month.
  • Helmut Grohne did 1.5h (out of 10.0h assigned), thus carrying over 8.5h to the next month.
  • Lee Garrett did 10.75h (out of 17.75h assigned), thus carrying over 7.0h to the next month.
  • Markus Koschany did 30.0h (out of 30.0h assigned).
  • Ola Lundqvist did 4.0h (out of 0h assigned and 19.5h from previous period), thus carrying over 15.5h to the next month.
  • Roberto C. Sánchez did 12.0h (out of 5.0h assigned and 7.0h from previous period).
  • Santiago Ruano Rincón did 13.625h (out of 7.75h assigned and 8.25h from previous period), thus carrying over 2.375h to the next month.
  • Sean Whitton did 13.0h (out of 6.0h assigned and 7.0h from previous period).
  • Sylvain Beucler did 7.5h (out of 11.25h assigned and 6.5h from previous period), thus carrying over 10.25h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 16.0h (out of 9.25h assigned and 6.75h from previous period).
  • Utkarsh Gupta did 0.0h (out of 0.75h assigned and 17.0h from previous period), thus carrying over 17.75h to the next month.

Evolution of the situation In October, we have released 49 DLAs. Of particular note in the month of October, LTS contributor Chris Lamb issued DLA 3627-1 pertaining to Redis, the popular key-value database similar to Memcached, which was vulnerable to an authentication bypass vulnerability. Fixing this vulnerability involved dealing with a race condition that could allow another process an opportunity to establish an otherwise unauthorized connection. LTS contributor Markus Koschany was involved in the mitigation of CVE-2023-44487, which is a protocol-level vulnerability in the HTTP/2 protocol. The impacts within Debian involved multiple packages, across multiple releases, with multiple advisories being released (both DSA for stable and old-stable, and DLA for LTS). Markus reviewed patches and security updates prepared by other Debian developers, investigated reported regressions, provided patches for the aforementioned regressions, and issued several security updates as part of this. Additionally, as MariaDB 10.3 (the version originally included with Debian buster) passed end-of-life earlier this year, LTS contributor Emilio Pozuelo Monfort has begun investigating the feasibility of backporting MariaDB 10.11. The work is in early stages, with much testing and analysis remaining before a final decision can be made, as this is only one of several potential courses of action concerning MariaDB. Finally, LTS contributor Lee Garrett has invested considerable effort into the development of the Functional Test Framework here. While so far only an initial version has been published, it already has several features which we intend to begin leveraging for testing of LTS packages. In particular, the FTF supports provisioning multiple VMs for the purposes of performing functional tests of network-facing services (e.g., file services, authentication, etc.). These tests are in addition to the various unit-level tests which are executed during package build time. Development work will continue on FTF, and as it matures and begins to see wider use within LTS we expect it to improve the quality of the updates we publish.

Thanks to our sponsors Sponsors that joined recently are in bold.

7 November 2023

Matthew Palmer: PostgreSQL Encryption: The Available Options

On an episode of Postgres FM, the hosts had a (very brief) discussion of data encryption in PostgreSQL. While Postgres FM is a podcast well worth a subscribe, the hosts aren't data security experts, and so as someone who builds a queryable database encryption system, I found the coverage to be somewhat lacking. I figured I'd provide a more complete survey of the available options for PostgreSQL-related data encryption.

The Status Quo By default, when you install PostgreSQL, there is no data encryption at all. That means that anyone who gets access to any part of the system can read all the data they have access to. This is, of course, not peculiar to PostgreSQL: basically everything works much the same way. What's stopping an attacker from nicking off with all your data is the fact that they can't access the database at all. The things that are acting as protection are perimeter defences, like putting the physical equipment running the server in a secure datacenter, firewalls to prevent internet randos connecting to the database, and strong passwords. This is referred to as tortoise security: it's tough on the outside, but soft on the inside. Once that outer shell is cracked, the delicious, delicious data is ripe for the picking, and there's absolutely nothing to stop a miscreant from going to town and making off with everything. It's a good idea to plan your defenses on the assumption you're going to get breached sooner or later. Having good defence-in-depth includes denying the attacker access to your data even if they compromise the database. This is where encryption comes in.

Storage-Layer Defences: Disk / Volume Encryption To protect against the compromise of the storage that your database uses (physical disks, EBS volumes, and the like), it's common to employ encryption-at-rest, such as full-disk encryption, or volume encryption. These mechanisms protect against offline attacks, but provide no protection while the system is actually running. And therein lies the rub: your database is always running, so encryption at rest typically doesn't provide much value. If you're running physical systems, disk encryption is essential, but more to prevent accidental data loss, due to things like failing to wipe drives before disposing of them, rather than physical theft. In systems where volume encryption is only a tickbox away, it's also worth enabling, if only to prevent inane questions from your security auditors. Relying solely on storage-layer defences, though, is very unlikely to provide any appreciable value in preventing data loss.
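As a concrete illustration of the tickbox variety, block-level encryption for a dedicated PostgreSQL data volume can be set up with LUKS in a handful of commands; the device and mount point below are placeholders, and remember that this only protects data at rest, since a running server still sees plaintext:
# Encrypt a dedicated data volume and put the PostgreSQL data directory on it
cryptsetup luksFormat /dev/sdb1
cryptsetup open /dev/sdb1 pgdata
mkfs.ext4 /dev/mapper/pgdata
mount /dev/mapper/pgdata /var/lib/postgresql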

Database-Layer Defences: Transparent Database Encryption If you've used proprietary database systems in high-security environments, you might have come across Transparent Database Encryption (TDE). There are also a couple of proprietary extensions for PostgreSQL that provide this functionality. TDE is essentially encryption-at-rest implemented inside the database server. As such, it has much the same drawbacks as disk encryption: few real-world attacks are thwarted by it. There is a very small amount of additional protection, in that physical level backups (as produced by pg_basebackup) are protected, but the vast majority of attacks aren't stopped by TDE. Any attacker who can access the database while it's running can just ask for an SQL-level dump of the stored data, and they'll get the unencrypted data quick as you like.

Application-Layer Defences: Field Encryption If you want to take the database out of the threat landscape, you really need to encrypt sensitive data before it even gets near the database. This is the realm of field encryption, more commonly known as application-level encryption. This technique involves encrypting each field of data before it is sent to be stored in the database, and then decrypting it again after it's retrieved from the database. Anyone who gets the data from the database directly, whether via a backup or a direct connection, is out of luck: they can't decrypt the data, and therefore it's worthless. There are, of course, some limitations of this technique. For starters, every ORM and data mapper out there has rolled their own encryption format, meaning that there's basically zero interoperability. This isn't a problem if you build everything that accesses the database using a single framework, but if you ever feel the need to migrate, or use the database from multiple codebases, you're likely in for a rough time. The other big problem of traditional application-level encryption is that, when the database can't understand what data it's storing, it can't run queries against that data. So if you want to encrypt, say, your users' dates of birth, but you also need to be able to query on that field, you need to choose between one or the other: you can't have both at the same time. You may think to yourself, "but this isn't any good, an attacker that breaks into my application can still steal all my data!". That is true, but security is never binary. The name of the game is reducing the attack surface, making it harder for an attacker to succeed. If you leave all the data unencrypted in the database, an attacker can steal all your data by breaking into the database or by breaking into the application. Encrypting the data reduces the attacker's options, and allows you to focus your resources on hardening the application against attack, safe in the knowledge that an attacker who gets into the database directly isn't going to get anything valuable.
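To make the idea concrete, here is a deliberately simplified sketch of field encryption driven from the shell. A real application would use an authenticated encryption primitive from its own language's crypto library and a proper key-management scheme; the table, column and environment variable names are invented for the example:
# Encrypt the value client-side before it ever reaches the database
dob="1990-01-01"
dob_enc="$(printf '%s' "$dob" | openssl enc -aes-256-cbc -pbkdf2 -pass env:DATA_KEY -base64 -A)"
psql "$DATABASE_URL" -c "INSERT INTO users (id, dob_enc) VALUES (1, '$dob_enc');"
# The database (and anyone who dumps it) only ever sees ciphertext;
# decryption happens back on the client
psql -At "$DATABASE_URL" -c "SELECT dob_enc FROM users WHERE id = 1;" \
    | openssl enc -d -aes-256-cbc -pbkdf2 -pass env:DATA_KEY -base64 -A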

Sidenote: The Curious Case of pgcrypto PostgreSQL ships a contrib module called pgcrypto, which provides encryption and decryption functions. This sounds ideal to use for encrypting data within our applications, as it's available no matter what we're using to write our application. It avoids the problem of framework-specific cryptography, because you call the same PostgreSQL functions no matter what language you're using, which produces the same output. However, I don't recommend ever using pgcrypto's data encryption functions, and I doubt you will find many other cryptographic engineers who will, either. First up, and most horrifyingly, it requires you to pass the long-term keys to the database server. If there's an attacker actively in the database server, they can capture the keys as they come in, which means all the data encrypted using that key is exposed. Sending the keys can also result in the keys ending up in query logs, both on the client and server, which is obviously a terrible result. Less scary, but still very concerning, is that pgcrypto's available cryptography is, to put it mildly, antiquated. We have a lot of newer, safer, and faster techniques for data encryption that aren't available in pgcrypto. This means that if you do use it, you're leaving a lot on the table, and need to have skilled cryptographic engineers on hand to avoid the potential pitfalls. In short: friends don't let friends use pgcrypto.
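To see the key-handling problem in action, consider this sketch (table and column names invented): the long-term key is part of the SQL text, so the server, its logs, and anything in between can capture it:
# The key travels to the server in plain sight as a function argument
psql "$DATABASE_URL" <<'SQL'
INSERT INTO users (id, dob_enc)
  VALUES (1, pgp_sym_encrypt('1990-01-01', 'long-term-secret-key'));
SELECT pgp_sym_decrypt(dob_enc, 'long-term-secret-key') FROM users WHERE id = 1;
SQL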

The Future: Enquo All this brings us to the project I run: Enquo. It takes application-layer encryption to a new level, by providing a language- and framework-agnostic cryptosystem that also enables encrypted data to be efficiently queried by the database. So, you can encrypt your users' dates of birth, in such a way that anyone with the appropriate keys can query the database to return, say, all users over the age of 18, but an attacker just sees unintelligible gibberish. This should greatly increase the amount of data that can be encrypted, and as the Enquo project expands its available data types and supported languages, the coverage of encrypted data will grow and grow. My eventual goal is to encrypt all data, all the time. If this appeals to you, visit enquo.org to use or contribute to the open source project, or EnquoDB.com for commercial support and hosted database options.

1 November 2023

Joachim Breitner: Joining the Lean FRO

Tomorrow is going to be a new first day in a new job for me: I am joining the Lean FRO, and I'm excited.

What is Lean? Lean is the new kid on the block of theorem provers. It s a pure functional programming language (like Haskell, with and on which I have worked a lot), but it s dependently typed (which Haskell may be evolving to be as well, but rather slowly and carefully). It has a refreshing syntax, built on top of a rather good (I have been told, not an expert here) macro system. As a dependently typed programming language, it is also a theorem prover, or proof assistant, and there exists already a lively community of mathematicians who started to formalize mathematics in a coherent library, creatively called mathlib.

What is a FRO? A Focused Research Organization has the organizational form of a small start up (small team, little overhead, a few years of runway), but its goals and measure for success are not commercial, as funding is provided by donors (in the case of the Lean FRO, the Simons Foundation International, the Alfred P. Sloan Foundation, and Richard Merkin). This allows us to build something that we believe is a contribution for the greater good, even though it s not (or not yet) commercially interesting enough and does not fit other forms of funding (such as research grants) well. This is a very comfortable situation to be in.

Why am I excited? To me, working on Lean seems to be the perfect mix: I have been working on language implementation for about a decade now, and always with a preference for functional languages. Add to that my interest in theorem proving, where I have used Isabelle and Coq so far, and played with Agda and others. So technically, clearly up my alley. Furthermore, the language isn t too old, and plenty of interesting things are simply still to do, rather than tried before. The ecosystem is still evolving, so there is a good chance to have some impact. On the other hand, the language isn t too young either. It is no longer an open question whether we will have users: we have them already, they hang out on zulip, so if I improve something, there is likely someone going to be happy about it, which is great. And the community seems to be welcoming and full of nice people. Finally, this library of mathematics that these users are building is itself an amazing artifact: Lots of math in a consistent, machine-readable, maintained, documented, checked form! With a little bit of optimism I can imagine this changing how math research and education will be done in the future. It could be for math what Wikipedia is for encyclopedic knowledge and OpenStreetMap for maps and the thought of facilitating that excites me. With this new job I find that when I am telling friends and colleagues about it, I do not hesitate or hedge when asked why I am doing this. This is a good sign.

What will I be doing? We ll see what main tasks I ll get to tackle initially, but knowing myself, I expect I ll get broadly involved. To get up to speed I started playing around with a few things already, and for example created Loogle, a Mathlib search engine inspired by Haskell s Hoogle, including a Zulip bot integration. This seems to be useful and quite well received, so I ll continue maintaining that. Expect more about this and other contributions here in the future.

20 October 2023

Freexian Collaborators: Debian Contributions: Freexian meetup, debusine updates, lpr/lpd in Debian, and more! (by Utkarsh Gupta, Stefano Rivera)

Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

Freexian Meetup, by Stefano Rivera, Utkarsh Gupta, et al. During DebConf, Freexian organized a meetup for its collaborators and those interested in learning more about Freexian and its services. It was well received and many people interested in Freexian showed up. Some developers who were interested in contributing to LTS came to get more details about joining the project. And some prospective customers came to get to know us and ask questions. Sadly, the tragic loss of Abraham shook DebConf, both individually and structurally. The meetup got rescheduled to a small room without video coverage. With that, we still had a wholesome interaction and here s a quick picture from the meetup taken by Utkarsh (which is also why he s missing!).

Debusine, by Raphaël Hertzog, et al. Freexian has been investing in debusine for a while, but development speed is about to increase dramatically thanks to funding from SovereignTechFund.de. Raphaël laid out the 5 milestones of the funding contract, and filed the issues for the first milestone. Together with Enrico and Stefano, they established a workflow for the expanded team. Among the first steps of this milestone, Enrico started to work on a developer-friendly description of debusine that we can use when we reach out to the many Debian contributors that we will have to interact with. And Raphaël started the design work of the autopkgtest and lintian tasks, i.e. what's the interface to schedule such tasks, what behavior and what associated options do we support? At this point you might wonder what debusine is supposed to be, so let us try to answer this: Debusine manages scheduling and distribution of Debian-related build and QA tasks to a network of worker machines. It also manages the resulting artifacts and provides the results in an easy to consume way. We want to make it easy for Debian contributors to leverage all the great QA tools that Debian provides. We want to build the next generation of Debian's build infrastructure, one that will continue to reliably do what it already does, but that will also enable distribution-wide experiments, custom package repositories and custom workflows with advanced package reviews. If this all sounds interesting to you, don't hesitate to watch the project on salsa.debian.org and to contribute.

lpr/lpd in Debian, by Thorsten Alteholz During Debconf23, Till Kamppeter presented CPDB (Common Print Dialog Backend), a new way to handle print queues. After this talk it was discussed whether the old lpr/lpd based printing system could be abandoned in Debian or whether there is still demand for it. So Thorsten asked on the debian-devel email list whether anybody uses it. Oddly enough, these old packages are still useful:
  • Within a small network it is easier to distribute a printcap file, than to properly configure cups clients.
  • One of the biggest manufacturers of WLAN routers and DSL boxes only supports raw queues when attaching a USB printer to their hardware. Admittedly, the CPDB still has problems with such raw queues.
  • The Pharos printing system at MIT is still lpd-based.
As a result, the lpr/lpd stuff is not yet ready to be abandoned and Thorsten will adopt the relevant packages (or rather move them under the umbrella of the debian-printing team). Though it is not planned to develop new features, those packages should at least have a maintainer. This month Thorsten adopted rlpr, a utility for lpd printing without using /etc/printcap. The next one he is working on is lprng, an lpr/lpd printer spooling system. If you know of any other package that is also needed and still maintained by the QA team, please tell Thorsten.

/usr-merge, by Helmut Grohne Discussion about lifting the file move moratorium has been initiated with the CTTE and the release team. A formal lift is dependent on updating debootstrap in older suites though. A significant number of packages can automatically move their systemd unit files if dh_installsystemd and systemd.pc change their installation targets. Unfortunately, doing so makes some packages FTBFS and therefore patches have been filed. The analysis tool, dumat, has been enhanced to better understand which upgrade scenarios are considered supported to reduce false positive bug filings and gained a mode for local operation on a .changes file meant for inclusion in salsa-ci. The filing of bugs from dumat is still manual to improve the quality of reports. Since September, the moratorium has been lifted.

Miscellaneous contributions
  • Raphaël updated Django's backport in bullseye-backports to match the latest security release that was published in bookworm. Tracker.debian.org is still using that backport.
  • Helmut Grohne sent 13 patches for cross build failures.
  • Helmut Grohne performed a maintenance upload of debvm enabling its use in autopkgtests.
  • Helmut Grohne wrote an API-compatible reimplementation of autopkgtest-build-qemu. It is powered by mmdebstrap, therefore unprivileged, EFI-only and will soon be included in mmdebstrap.
  • Santiago continued the work regarding how to make it easier to (automatically) test reverse dependencies. An example of the ongoing work was presented during the Salsa CI BoF at DebConf 23.
    In fact, omniorb-dfsg test pipelines as the above were used for the omniorb-dfsg 4.3.0 transition, verifying how the reverse dependencies (tango, pytango and omnievents) were built and how their autopkgtest jobs run with the to-be-uploaded omniorb-dfsg new release.
  • Utkarsh and Stefano attended and helped run DebConf 23. Also continued winding up DebConf 22 accounting.
  • Anton Gladky did some science team uploads to fix RC bugs.

12 October 2023

Freexian Collaborators: Monthly report about Debian Long Term Support, September 2023 (by Santiago Ruano Rincón)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In September, 21 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 10.0h (out of 0h assigned and 14.0h from previous period), thus carrying over 4.0h to the next month.
  • Adrian Bunk did 7.0h (out of 17.0h assigned), thus carrying over 10.0h to the next month.
  • Anton Gladky did 9.5h (out of 7.5h assigned and 7.5h from previous period), thus carrying over 5.5h to the next month.
  • Bastien Roucariès did 16.0h (out of 15.5h assigned and 1.5h from previous period), thus carrying over 1.0h to the next month.
  • Ben Hutchings did 17.0h (out of 17.0h assigned).
  • Chris Lamb did 17.0h (out of 17.0h assigned).
  • Emilio Pozuelo Monfort did 30.0h (out of 30.0h assigned).
  • Guilhem Moulin did 18.25h (out of 18.25h assigned).
  • Helmut Grohne did 10.0h (out of 10.0h assigned).
  • Lee Garrett did 17.0h (out of 16.5h assigned and 0.5h from previous period).
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Ola Lundqvist did 4.5h (out of 0h assigned and 24.0h from previous period), thus carrying over 19.5h to the next month.
  • Roberto C. Sánchez did 5.0h (out of 12.0h assigned), thus carrying over 7.0h to the next month.
  • Santiago Ruano Rincón did 7.75h (out of 16.0h assigned), thus carrying over 8.25h to the next month.
  • Sean Whitton did 7.0h (out of 7.0h assigned).
  • Sylvain Beucler did 10.5h (out of 17.0h assigned), thus carrying over 6.5h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 13.25h (out of 16.0h assigned), thus carrying over 2.75h to the next month.

Evolution of the situation In September, we have released 44 DLAs. The month of September was a busy month for the LTS Team. A notable security issue fixed in September was the high-severity CVE-2023-4863, a heap buffer overflow that allowed remote attackers to perform an out-of-bounds memory write via a crafted WebP file. This CVE was covered by three DLAs for different packages: firefox-esr, libwebp and thunderbird. The libwebp backported patch was sent upstream, where it was adapted and applied to the 0.6.1 branch. It is also worth noting that LTS contributor Markus Koschany included in his work updates to packages in Debian Bullseye and Bookworm that are under the umbrella of the Security Team: xrdp, jetty9 and mosquitto. As every month, there was important behind-the-scenes work by the Front Desk staff, who triaged, analyzed and reviewed dozens of vulnerabilities to decide whether they warrant a security update. This is very important work, since we need to trade off between the frequency of updates and the stability of the LTS release.

Thanks to our sponsors Sponsors that joined recently are in bold.
